This section describes how to get started developing DataFusion.
For information on developing with Ballista, see the Ballista developer documentation.
DataFusion is written in Rust and uses a standard Rust toolkit:

- `cargo build`
- `cargo fmt` to format the code
- `cargo test` to test

Below is a checklist of what you need to do to add a new scalar function to DataFusion:
- `BuiltinScalarFunction`
- `FromStr` with the name of the function as called by SQL
- `return_type` with the expected return type of the function, given an incoming type
- `signature` with the signature of the function (number and types of its arguments)
- `create_physical_expr` mapping the built-in to the implementation
- `unary_scalar_expr!` macro for the new function
- `pub use expr::{}` set

Below is a checklist of what you need to do to add a new aggregate function to DataFusion:
- `Accumulator` and `AggregateExpr`
- `BuiltinAggregateFunction`
- `FromStr` with the name of the function as called by SQL
- `return_type` with the expected return type of the function, given an incoming type
- `signature` with the signature of the function (number and types of its arguments)
- `create_aggregate_expr` mapping the built-in to the implementation

The query plans represented by `LogicalPlan`
nodes can be graphically rendered using Graphviz.
To do so, save the output of the `display_graphviz` function to a file:
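use std::fs::File;
use std::io::Write;
// Create plan somehow...
let mut output = File::create("/tmp/plan.dot")?;
// Propagate write errors instead of silently dropping the Result.
write!(output, "{}", plan.display_graphviz())?;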
Then, use the `dot` command line tool to render it into a file that can be displayed. For example, the following command creates a `/tmp/plan.pdf` file:
dot -Tpdf < /tmp/plan.dot > /tmp/plan.pdf
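The two steps above can also be scripted from Rust. The following is a minimal sketch rather than DataFusion code: `write_dot` and `render_pdf` are hypothetical helper names, the DOT text is a hand-written stand-in for the output of `plan.display_graphviz()` (building a real `LogicalPlan` needs the datafusion crate), and it assumes the Graphviz `dot` binary is available on the `PATH`.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::Command;

/// Writes DOT text to `dot_path`.
/// With DataFusion, the text would come from `plan.display_graphviz()`;
/// here it is passed in as a plain string (assumption for this sketch).
fn write_dot(dot_text: &str, dot_path: &Path) -> io::Result<()> {
    fs::write(dot_path, dot_text)
}

/// Runs `dot -Tpdf <dot_path> -o <pdf_path>`.
/// Returns Ok(true) if `dot` exited successfully, Ok(false) if it reported
/// an error, and Err if the `dot` binary could not be started at all.
fn render_pdf(dot_path: &Path, pdf_path: &Path) -> io::Result<bool> {
    let status = Command::new("dot")
        .arg("-Tpdf")
        .arg(dot_path)
        .arg("-o")
        .arg(pdf_path)
        .status()?;
    Ok(status.success())
}

fn main() -> io::Result<()> {
    // Hypothetical stand-in for the output of `plan.display_graphviz()`.
    let dot_text = "digraph plan {\n  scan -> filter;\n  filter -> projection;\n}\n";

    let dir = std::env::temp_dir();
    let dot_path = dir.join("plan.dot");
    let pdf_path = dir.join("plan.pdf");

    write_dot(dot_text, &dot_path)?;

    match render_pdf(&dot_path, &pdf_path) {
        Ok(true) => println!("wrote {}", pdf_path.display()),
        Ok(false) => eprintln!("dot exited with an error"),
        Err(e) => eprintln!("could not run dot (is Graphviz installed?): {}", e),
    }
    Ok(())
}
```

Passing the input file and `-o` as arguments avoids the shell redirection used in the command above, so the sketch does not depend on a particular shell.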
We formalize DataFusion semantics and behaviors through specification documents. These specifications serve as references that help resolve ambiguities during development or code reviews.
You are also welcome to propose changes to existing specifications or create new specifications as you see fit.
Here is the list of currently active specifications:

We are using `prettier` to format `.md` files.
You can either use `npm i -g prettier` to install it globally or use `npx` to run it as a standalone binary. Using `npx` requires a working Node.js environment. Upgrading to the latest prettier is recommended (by adding `--upgrade` to the `npm` command).
$ prettier --version
2.3.0
After you've confirmed your prettier version, you can format all the `.md` files:
prettier -w {ballista,datafusion,datafusion-examples,dev,docs,python}/**/*.md