Dangerous DBA A blog for those DBA's who live on the edge

QCon London Day 1

March 7, 2016 11:15 pm / dangerousDBA

This is not a run-of-the-mill conference for a DBA, but in my evolving role it is good to get out there and experience something a little different from the data conferences that I normally go to. QCon is primarily aimed at software developers, has been running for about 10 years and now takes place all over the world. Apart from the keynotes, I spent this first day exclusively in the “Stream Processing @ Scale” track, as this is the focus of the project at work at the moment.

QCon London Day 1: Keynote – UNEVENLY DISTRIBUTED by Adrian Colyer

Adrian reads a paper a day, summarises it and then publishes it on “the morning paper”, which is no mean feat, and I admire him for the dedication this must take! This was an interesting keynote that made the case for reading a paper a day because:

  1. They are great thinking tools – They get you to think about what people are doing and what you could do or try
  2. They raise your expectations – Make your solutions better, or what you think you should be getting as a solution
  3. They give you real life lessons you can learn from – Read about what people have implemented, or given to the community to implement for themselves
  4. They are a great conversation – You can see how ideas progress through time, and who has built what on top of what
  5. They are unevenly distributed – Across subjects

Basically, read more papers and you will know more stuff and be more awesome in your job, bringing more to the table for your employer and yourself.

QCon London Day 1: Talks

PATTERNS OF RELIABLE IN-STREAM PROCESSING @ SCALE – Alexey Kharlamov

This was a rather short but interesting talk that started a theme for the day and left me with a question that has not yet been fully answered. Alexey went through the different patterns that the company he works for has gone through to process data in streams. They had tried the Lambda and Kappa architectures and were now working towards something else, but that was not alluded to.

Learnt
  • You need event time as well as event capture time for proper windowing. This reinforces what we see at my workplace and validates everything else that is out there
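
To make the event time versus capture time point concrete, here is a minimal sketch in plain Python (not code from the talk). The 60-second tumbling window, the field names and the sample events are all assumptions made up for illustration. It counts the same events per window twice, once by each timestamp, so you can see a late-arriving event land in the wrong window when only capture time is used.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed tumbling window size, purely illustrative


def window_start(ts, size=WINDOW_SECONDS):
    """Return the start of the tumbling window that contains ts."""
    return ts - (ts % size)


def count_by_window(events, time_field):
    """Count events per tumbling window, using the chosen timestamp field."""
    counts = defaultdict(int)
    for event in events:
        counts[window_start(event[time_field])] += 1
    return dict(counts)


# Hypothetical events, listed in arrival order.
# Event "c" happened at t=59 but was only captured at t=125.
events = [
    {"id": "a", "event_time": 10, "capture_time": 12},
    {"id": "b", "event_time": 50, "capture_time": 55},
    {"id": "d", "event_time": 70, "capture_time": 72},
    {"id": "c", "event_time": 59, "capture_time": 125},
]

by_event_time = count_by_window(events, "event_time")
by_capture_time = count_by_window(events, "capture_time")

print(by_event_time)    # {0: 3, 60: 1}
print(by_capture_time)  # {0: 2, 60: 1, 120: 1}
```

Windowing by capture time puts event "c" two windows later than when it actually happened, which is why both timestamps need to travel with the event.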

STREAM PROCESSING WITH APACHE FLINK – Robert Metzger

A new product, in a way, that will get its full release tomorrow (08/03/2016). It promises to completely subsume batch by allowing windowing over “large” timescales, utilising in-memory and disk-persisted aggregations, as well as a host of other interesting features that other systems do not offer.

Learnt
  • Google Dataflow is being made into an Apache incubator project called Apache Beam

MICROSERVICES FOR A STREAMING WORLD – Ben Stopford

This is a brave new world, and it’s a world where jobs that you (I) would traditionally use a database for (looking up values) can now be done with variations of the open source streaming projects. This talk looked at an add-on for Apache Kafka called KStreams that allows you to persist the latest version of a key so that it can be used by a microservice, in combination with a stream, to create other services. We also need to embrace decentralisation.

Learnt
  • KStreams can be used to make KTables that can be joined with data from a stream to enable querying for a microservice
  • Kafka has compacted topics that allow you to store the latest value for a key if you so wish!
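
The two ideas above, keeping only the latest value per key and joining that against a stream, can be sketched in plain Python. This is only an illustration of the concept and does not use the Kafka or Kafka Streams APIs; the function names, the customer tiers and the orders are all hypothetical. It also simplifies things by building the table completely before the stream is read, whereas a real system would keep updating the table while events flow.

```python
def compact(changelog):
    """Keep only the latest value per key, like Kafka's log compaction eventually does."""
    table = {}
    for key, value in changelog:
        if value is None:
            # A None value is treated as a delete marker (tombstone) for the key.
            table.pop(key, None)
        else:
            table[key] = value
    return table


def join_stream_with_table(stream, table):
    """Enrich each stream event with the current table value for its key."""
    for key, payload in stream:
        if key in table:  # events without a matching key are dropped
            yield {"key": key, "payload": payload, "lookup": table[key]}


# Hypothetical changelog of customer tiers: c1 is upgraded, c3 is deleted.
changelog = [
    ("c1", "bronze"),
    ("c2", "silver"),
    ("c1", "gold"),
    ("c3", "bronze"),
    ("c3", None),
]
table = compact(changelog)

# Hypothetical stream of orders keyed by customer.
orders = [("c1", "order-100"), ("c3", "order-101"), ("c2", "order-102")]
enriched = list(join_stream_with_table(orders, table))

print(table)
for row in enriched:
    print(row)
```

The lookup that would traditionally be a database query becomes a dictionary read against state derived from the log itself.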

STREAMING AUTO-SCALING IN GOOGLE CLOUD DATAFLOW – Manuel Fahndrich

This talk seemed like quite a long explanation of the planning that goes into the “auto scaler” when it decides whether to scale or not. It was interesting to see some of the formulas, but in practice this is hidden from the user behind the developer console, so it was for informational purposes only. Manuel also went through some of the challenges that they have not solved yet, e.g. Quanta. Quanta is where, as you downsize the number of machines (and the virtual disks that sit behind them), you end up with an uneven distribution and therefore other machines can only process at the rate of the machine with the least disks.

Learnt
  • Google are still going to make money from you even if you enable auto scaling, just maybe not as much!
  • Quanta
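
A rough way to picture the uneven distribution is the arithmetic below. It is a sketch under my own assumptions, not the Dataflow auto scaler's actual algorithm: I assume a fixed pool of 8 disks that is spread across the machines as evenly as possible, and the function names are made up.

```python
def spread_disks(total_disks, machines):
    """Spread a fixed number of disks over the machines as evenly as possible."""
    base, extra = divmod(total_disks, machines)
    # The first `extra` machines each take one more disk than the rest.
    return [base + 1 if i < extra else base for i in range(machines)]


def is_even(total_disks, machines):
    """True when every machine ends up with the same number of disks."""
    return total_disks % machines == 0


TOTAL_DISKS = 8  # assumed pool size, purely illustrative

# Scale down from 4 machines to 1 and show the layout at each size.
for machines in range(4, 0, -1):
    layout = spread_disks(TOTAL_DISKS, machines)
    label = "even" if is_even(TOTAL_DISKS, machines) else "uneven"
    print(machines, layout, label)
```

With these numbers only 3 machines gives an uneven layout (3, 3 and 2 disks), so only some cluster sizes avoid the problem.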

DATA STREAMING OPEN SPACE

This was not very well subscribed, with only me, the facilitator and three other participants. I was hoping for more people to be there so I could learn what people are doing, what they have tried and what to avoid! As it turns out I was the second most experienced with streaming there, which, given that our usage is in its infancy, is slightly worrying about what the rest of the world is doing.

Learnt
  • Streaming is new and not many people are sharing, if they are doing it.

REALTIME STREAM COMPUTING & ANALYTICS @ UBER – Sudhir Tonse

It was good to see what a disruptive tech company is doing, and that they are building tools because they can’t find any to support their needs. The talk took you through the Uber tech stack for producing data, processing data, storing the data, querying and consumption.

Learnt
  • Uber’s world is hexagons
  • There are loads of tools out there, and new ones come out all the time; use the one that best suits your needs at the time you need it. Change only when there is a better one, not just because it is two weeks later

QCon London Day 1: Evening Keynote – BLT: BABBAGE LOVELACE TURING (SO WHO DID INVENT THAT COMPUTER?) John Graham-Cumming & Sydney Padua

I found this a little long and not quite the content I was hoping for; I was thinking drunk histories. It was interesting and well put together, but dwelt on some points for too long.

Questions From Today

  1. Why do people use Apache X, Y and Z and manage all of that themselves rather than using an “autoscaling solution” such as GCS?
  2. Why, if there are so many people using Apache X, Y and Z, are there not more people talking about it in production, apart from large “disruptive” organisations such as Uber?
  3. Why, if we have the ability to output this data to so many different (heterogeneous) stores, are there very few (if any) tools that pull it all together again?

Disclaimer:

The posts here represent my personal views and not those of my employer. Any technical advice or instructions are based on my own personal knowledge and experience, and should only be followed by an expert after a careful analysis. Please test any actions before performing them in a critical or nonrecoverable environment. Any actions taken based on my experiences should be done with extreme caution. I am not responsible for any adverse results. DB2 is a trademark of IBM. I am not an employee or representative of IBM.
