Paper Notes: F1 Query: Declarative Querying at Scale
F1 Query addresses similar use cases as Presto, covered in the previous post, and has a similar design. Google has several data stores, and data for a single application might be spread across multiple of them. The initial use case for F1 Query was to provide a federated query engine that can query multiple data sources. F1 claims its main strength is the ability to execute queries across a wide spectrum, ranging from interactive queries that complete in tens of milliseconds to queries that take a few hours, using the same query language and engine.
To achieve this, F1 Query has two execution modes:
Interactive execution mode.
Batch execution mode.
The client selects the execution mode. Interactive mode executes everything in memory and has two variants: one executes the whole query on a single machine, and the other performs distributed execution, with different fragments (or partitions) reading non-overlapping subsets of the data. Data is mostly kept in memory in interactive mode, and its design is very similar to Presto's.
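The distributed variant of interactive mode can be sketched as follows: the input is split into non-overlapping fragments, each fragment is processed entirely in memory by its own worker, and the results are merged. This is a minimal illustration, not F1's actual API; the names `execute_fragment` and `run_interactive` are hypothetical.

```python
# Sketch of distributed interactive execution: non-overlapping fragments,
# each processed fully in memory, merged by a coordinator.
from concurrent.futures import ThreadPoolExecutor

def execute_fragment(rows, predicate):
    # Each fragment scans only its own subset of the data, in memory.
    return [row for row in rows if predicate(row)]

def run_interactive(rows, predicate, num_fragments=4):
    # Partition the input into non-overlapping subsets, one per fragment.
    fragments = [rows[i::num_fragments] for i in range(num_fragments)]
    with ThreadPoolExecutor(max_workers=num_fragments) as pool:
        partials = pool.map(execute_fragment, fragments,
                            [predicate] * num_fragments)
    # The coordinator merges the partial results into the final answer.
    return [row for part in partials for row in part]

result = run_interactive(list(range(100)), lambda x: x % 10 == 0)
print(sorted(result))  # each input row is covered by exactly one fragment
```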
In batch mode, the query executor runs multiple MapReduce programs to complete the query, materializing intermediate results to disk. As of the paper's publication in 2018, this was being migrated to a new dataflow-based framework. This is very similar to “Presto on Spark”, where Spark is used as the execution framework for long-running Presto queries.
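The batch-mode behavior can be sketched as two MapReduce-style stages with the intermediate result materialized to a file between them, so a failed downstream stage can be retried from disk rather than re-running the whole query. This is an illustrative toy, not F1's actual executor; the stage functions and file format are assumptions.

```python
# Sketch of batch mode: stage 1 materializes its output to disk,
# stage 2 re-reads that file and aggregates.
import json
import os
import tempfile
from collections import defaultdict

def map_stage(records, out_path):
    # Stage 1: emit (key, 1) pairs and materialize them to disk.
    with open(out_path, "w") as f:
        for rec in records:
            f.write(json.dumps([rec, 1]) + "\n")

def reduce_stage(in_path):
    # Stage 2: re-read the materialized intermediate and aggregate.
    counts = defaultdict(int)
    with open(in_path) as f:
        for line in f:
            key, val = json.loads(line)
            counts[key] += val
    return dict(counts)

tmp = os.path.join(tempfile.mkdtemp(), "stage1.jsonl")
map_stage(["a", "b", "a"], tmp)
print(reduce_stage(tmp))  # {'a': 2, 'b': 1}
```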
F1 Query supports the SQL 2011 standard, extended to better support nested records, arrays, and protocol buffers.
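The array and nested-record extensions behave like SQL's UNNEST: a repeated field expands into one output row per element, with the other fields duplicated. A minimal sketch, assuming records are plain dicts; the field names are hypothetical.

```python
# Sketch of UNNEST-style expansion of a repeated (array-valued) field.
def unnest(records, repeated_field):
    for rec in records:
        for elem in rec[repeated_field]:
            # One output row per array element; other fields are duplicated.
            row = {k: v for k, v in rec.items() if k != repeated_field}
            row[repeated_field] = elem
            yield row

orders = [{"id": 1, "items": ["pen", "ink"]}, {"id": 2, "items": ["pad"]}]
flat = list(unnest(orders, "items"))
print(flat)
# [{'id': 1, 'items': 'pen'}, {'id': 1, 'items': 'ink'}, {'id': 2, 'items': 'pad'}]
```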
The paper also covers extensibility. In addition to embedded execution of UDFs, F1 Query supports UDF servers. UDF servers are stateless and support scalar functions, aggregate functions, and table-valued functions. F1 Query makes UDF calls in a batched, pipelined fashion. A remote data store can be exposed as a table-valued function with no input tables. Table-valued functions use bi-directional streaming and maintain pending input rows and output rows.
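The batched UDF-server interaction can be sketched as follows: rather than one remote call per row, the caller buffers rows into batches and flushes pending input as each batch fills. The `UdfServer` class and batch size are illustrative assumptions, not F1's actual protocol.

```python
# Sketch of batched calls to a stateless scalar-UDF server.
class UdfServer:
    """Stateless UDF server: applies fn to a whole batch of rows at once."""
    def __init__(self, fn):
        self.fn = fn

    def evaluate_batch(self, rows):
        return [self.fn(r) for r in rows]

def call_udf_batched(server, rows, batch_size=3):
    # Buffer pending input rows; flush a batch once it fills up.
    pending, out = [], []
    for row in rows:
        pending.append(row)
        if len(pending) == batch_size:
            out.extend(server.evaluate_batch(pending))
            pending = []
    if pending:  # flush the final partial batch
        out.extend(server.evaluate_batch(pending))
    return out

server = UdfServer(lambda x: x * 2)
print(call_udf_batched(server, [1, 2, 3, 4, 5]))  # [2, 4, 6, 8, 10]
```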
One of F1's design principles is to avoid “cliffs” in performance, i.e. it does not switch to a drastically different component when a certain data threshold is crossed. For example, in lookup joins it spills to disk only the data that does not fit in memory, instead of switching to a completely disk-based structure.
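The "no cliffs" idea for a lookup join can be sketched like this: the build side stays in memory until a budget is exceeded, and only the overflow rows are spilled, so probes of in-memory keys stay fast no matter how large the input grows. The budget and spill-file format here are illustrative assumptions.

```python
# Sketch of a build-side hash table that spills only its overflow to disk.
import json
import os
import tempfile

class SpillingHashTable:
    def __init__(self, max_in_memory):
        self.mem = {}
        self.max_in_memory = max_in_memory
        self.spill_path = os.path.join(tempfile.mkdtemp(), "spill.jsonl")

    def insert(self, key, value):
        if len(self.mem) < self.max_in_memory:
            self.mem[key] = value  # fast path: stays in memory
        else:
            # Only rows past the memory budget are spilled to disk.
            with open(self.spill_path, "a") as f:
                f.write(json.dumps([key, value]) + "\n")

    def lookup(self, key):
        if key in self.mem:  # most probes never touch disk
            return self.mem[key]
        if os.path.exists(self.spill_path):
            with open(self.spill_path) as f:
                for line in f:
                    k, v = json.loads(line)
                    if k == key:
                        return v
        return None

table = SpillingHashTable(max_in_memory=2)
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    table.insert(k, v)
print(table.lookup("a"), table.lookup("c"))  # "a" from memory, "c" from disk
```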
The paper lists some production numbers. F1 Query runs on thousands of machines in each data center. Query latency ranges from tens of milliseconds for interactive queries to a few hours for batch-mode queries.

