Commit
feat(interactive): Add support for loading graph with odps table as data source (#3305)

- Implement `ODPSFragmentLoader` to support loading vertices/edges from an ODPS table: `odps://project/table_name`
- Since `ODPSFragmentLoader` and `CSVFragmentLoader` both read raw data into `arrow::Table`, we extract the common part into an `AbstractArrowFragmentLoader`, which provides the following interface.

  ```c++
  class IRecordBatchSupplier {
   public:
    // Will be called until GetNextBatch() returns nullptr.
    virtual std::shared_ptr<arrow::RecordBatch> GetNextBatch() = 0;
  };

  class AbstractArrowFragmentLoader : public IFragmentLoader {
   public:
    void AddVerticesRecordBatch(
        label_t v_label_id, const std::vector<std::string>& input_paths,
        std::function<std::shared_ptr<IRecordBatchSupplier>(
            label_t, const std::string&, const LoadingConfig&)>
            supplier_creator);

    void AddEdgesRecordBatch(
        label_t src_label_id, label_t dst_label_id, label_t edge_label_id,
        const std::vector<std::string>& input_paths,
        std::function<std::shared_ptr<IRecordBatchSupplier>(
            label_t, label_t, label_t, const std::string&,
            const LoadingConfig&)>
            supplier_creator);
  };
  ```

  `ODPSFragmentLoader` and `CSVFragmentLoader` simply inherit from this abstract class and call `AddVerticesRecordBatch` and `AddEdgesRecordBatch` with a lambda that specifies how to produce `RecordBatch`es from each `input_path`. `CSVFragmentLoader` produces RecordBatches with Arrow readers; `ODPSFragmentLoader` produces them with `ODPSReadClient`.
- For a customized FragmentLoader, users can specify the path to the library via `FLEX_OTHER_LOADERS` and call `Register()` when the customized FragmentLoader class is initialized. For example, the built-in `CSVFragmentLoader` is registered for `scheme=file` and `format=csv`.

  ```c++
  const bool CSVFragmentLoader::registered_ = LoaderFactory::Register(
      "file", "csv",
      static_cast<LoaderFactory::loader_initializer_t>(
          &CSVFragmentLoader::Make));
  ```
- ODPS-related code is placed at `flex/third_party/odps/include`.
  This code will be open-sourced by the ODPS team but is not ready yet, so for now we copy it here to make `ODPSFragmentLoader` work. After `odps-cpp-sdk` is open-sourced, we will replace the dependency with a git submodule.

Fix #3396
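The supplier contract described above (one supplier per input path, polled until `GetNextBatch()` returns null) can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `FakeBatch` stands in for `arrow::RecordBatch` so the sketch builds without Arrow, and `IBatchSupplier`, `InMemorySupplier`, `SupplierCreator`, and `CountRows` are hypothetical names. The label and `LoadingConfig` arguments of the real creator callback are omitted for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for arrow::RecordBatch (assumption: the real loader uses Arrow).
struct FakeBatch {
  std::vector<std::string> rows;
};

// Same shape as the IRecordBatchSupplier interface quoted above.
class IBatchSupplier {
 public:
  virtual ~IBatchSupplier() = default;
  // Returns nullptr once the input is exhausted.
  virtual std::shared_ptr<FakeBatch> GetNextBatch() = 0;
};

// Hypothetical supplier that slices an in-memory row list into batches.
class InMemorySupplier : public IBatchSupplier {
 public:
  InMemorySupplier(std::vector<std::string> rows, std::size_t batch_size)
      : rows_(std::move(rows)), batch_size_(batch_size) {}

  std::shared_ptr<FakeBatch> GetNextBatch() override {
    if (batch_size_ == 0 || pos_ >= rows_.size()) return nullptr;
    const std::size_t end = std::min(pos_ + batch_size_, rows_.size());
    auto batch = std::make_shared<FakeBatch>();
    batch->rows.assign(rows_.begin() + static_cast<std::ptrdiff_t>(pos_),
                       rows_.begin() + static_cast<std::ptrdiff_t>(end));
    pos_ = end;
    return batch;
  }

 private:
  std::vector<std::string> rows_;
  std::size_t batch_size_;
  std::size_t pos_ = 0;
};

// Simplified creator callback: builds one supplier for one input path.
using SupplierCreator =
    std::function<std::shared_ptr<IBatchSupplier>(const std::string&)>;

// Mimics what a loader might do: create a supplier per path and drain it.
std::size_t CountRows(const std::vector<std::string>& input_paths,
                      const SupplierCreator& creator) {
  std::size_t total = 0;
  for (const auto& path : input_paths) {
    auto supplier = creator(path);
    assert(supplier != nullptr);
    while (auto batch = supplier->GetNextBatch()) {
      total += batch->rows.size();
    }
  }
  return total;
}
```

A caller would pass a lambda as the creator, in the same spirit as the CSV and ODPS loaders passing lambdas that wrap an Arrow reader or an ODPS read client.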
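The registration step for custom loaders can be illustrated with a small self-registering factory. This is only a sketch of the general pattern under stated assumptions, not the actual `LoaderFactory`: `ILoader`, `SketchLoaderFactory`, `Create`, and `CsvLikeLoader` are hypothetical names, and the real initializer signature likely takes schema and configuration arguments that are left out here.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Minimal stand-in for IFragmentLoader (assumption).
class ILoader {
 public:
  virtual ~ILoader() = default;
  virtual std::string Name() const = 0;
};

// Hypothetical registry keyed by (scheme, format).
class SketchLoaderFactory {
 public:
  using loader_initializer_t = std::shared_ptr<ILoader> (*)();

  // Returns true so the result can initialize a static bool member.
  static bool Register(const std::string& scheme, const std::string& format,
                       loader_initializer_t initializer) {
    Registry()[std::make_pair(scheme, format)] = initializer;
    return true;
  }

  // Returns nullptr when no loader is registered for the pair.
  static std::shared_ptr<ILoader> Create(const std::string& scheme,
                                         const std::string& format) {
    auto& registry = Registry();
    auto it = registry.find(std::make_pair(scheme, format));
    if (it == registry.end()) return nullptr;
    return it->second();
  }

 private:
  using Key = std::pair<std::string, std::string>;

  // A function-local static is constructed on first use, so registrations
  // that run during static initialization always see a valid map.
  static std::map<Key, loader_initializer_t>& Registry() {
    static std::map<Key, loader_initializer_t> registry;
    return registry;
  }
};

// Hypothetical loader that registers itself when its static member is
// initialized, mirroring the CSVFragmentLoader::registered_ idiom.
class CsvLikeLoader : public ILoader {
 public:
  static std::shared_ptr<ILoader> Make() {
    return std::make_shared<CsvLikeLoader>();
  }
  std::string Name() const override { return "csv-like"; }

 private:
  static const bool registered_;
};

const bool CsvLikeLoader::registered_ =
    SketchLoaderFactory::Register("file", "csv", &CsvLikeLoader::Make);
```

Because registration happens as a side effect of static initialization, a loader compiled into a separate shared library would register itself when that library is loaded, which is presumably why the library path is supplied through `FLEX_OTHER_LOADERS`.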