
- Multiple source connectors, each with its own pre-processing instructions.
- 1 Embed connector through which all the data extracted from the sources will pass.
- 1 Sink connector in which all the generated embeddings are stored.
See the full selection of sources, embeds and sinks below.
Pipeline initialization
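As a rough illustration of the shape described above (several sources, one embed connector, one sink connector), here is a minimal sketch in plain Python. The names `SourceConfig`, `PipelineConfig`, `identity` and the connector labels are hypothetical and chosen for illustration only; they are not the library's actual classes.

```python
from dataclasses import dataclass
from typing import Callable, List


def identity(text: str) -> str:
    """Default pre-processing step: leave the text unchanged."""
    return text


@dataclass
class SourceConfig:
    # Hypothetical stand-in for a source connector.
    name: str
    # Each source carries its own pre-processing instructions.
    preprocess: Callable[[str], str] = identity


@dataclass
class PipelineConfig:
    # Hypothetical stand-in for a pipeline definition.
    sources: List[SourceConfig]  # one or more sources
    embed: str                   # exactly one embed connector
    sink: str                    # exactly one sink connector

    def __post_init__(self) -> None:
        if not self.sources:
            raise ValueError("a pipeline needs at least one source")


pipeline = PipelineConfig(
    sources=[
        SourceConfig("website", preprocess=str.strip),
        SourceConfig("files", preprocess=str.lower),
    ],
    embed="example-embed",
    sink="example-sink",
)
```

Keeping the embed and sink as single fields, rather than lists, mirrors the one-embed, one-sink constraint in the structure itself.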
If you have more than one source, design the metadata that each source outputs carefully. If the sources output different metadata properties, then depending on the sink this might lead to errors, or to vectors in an index that don’t share metadata properties. This can be challenging at retrieval time, for example when filtering on a property that only some vectors have.
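One way to catch mismatched metadata before loading anything is to compare the metadata keys each source produces. The sketch below is an assumption about how such a check could look; `check_metadata_keys` and the sample records are hypothetical and not part of the library.

```python
def check_metadata_keys(records_by_source):
    """Return, per source, the metadata keys it lacks relative to all sources.

    records_by_source maps a source name to a list of metadata dicts.
    Sources that already provide every key are omitted from the result.
    """
    keys_by_source = {
        source: set().union(*(record.keys() for record in records))
        for source, records in records_by_source.items()
    }
    all_keys = set().union(*keys_by_source.values())
    return {
        source: sorted(all_keys - keys)
        for source, keys in keys_by_source.items()
        if all_keys - keys
    }


# Hypothetical sample metadata from two sources.
missing = check_metadata_keys({
    "website": [{"url": "https://example.com", "title": "Home"}],
    "files": [{"path": "docs/a.txt", "title": "A"}],
})
print(missing)  # {'website': ['path'], 'files': ['url']}
```

An empty result means every source emits the same set of properties, so every vector in the index can be filtered the same way.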
Running a pipeline
This will trigger the extraction of data from the data sources, transformation using the defined pre-processing steps, and the loading of data into the defined vector store.
- Local
- Cloud
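The extract, transform and load steps described above can be sketched conceptually as follows. This is not the library's run method: `run_pipeline`, `toy_embed`, the in-memory list used as a sink, and the sample source are all hypothetical stand-ins.

```python
def toy_embed(text):
    """Stand-in for an embed connector: a tiny 2-dimensional vector."""
    return [float(len(text)), float(text.count(" "))]


def run_pipeline(sources, embed, sink):
    """Extract from each source, pre-process, embed, and load into the sink.

    sources is a list of (extract, preprocess) pairs, where extract()
    yields (text, metadata) tuples. Returns the number of vectors stored.
    """
    stored = 0
    for extract, preprocess in sources:
        for text, metadata in extract():
            cleaned = preprocess(text)  # per-source pre-processing
            sink.append({
                "vector": embed(cleaned),  # single shared embed step
                "text": cleaned,
                "metadata": metadata,
            })
            stored += 1
    return stored


def website_source():
    # Hypothetical source yielding (text, metadata) tuples.
    yield "  Hello world  ", {"source": "website"}
    yield "Second page", {"source": "website"}


vector_store = []  # in-memory stand-in for a sink connector
count = run_pipeline([(website_source, str.strip)], toy_embed, vector_store)
```

Whether the pipeline runs locally or in the cloud, the order of the three stages stays the same.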
Search a pipeline
This will query the pipeline’s sink for documents stored in vector representation.
- Local
- Cloud
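Conceptually, searching a sink means ranking the stored vectors by similarity to a query vector. The sketch below assumes cosine similarity and an in-memory list as the sink; real sinks choose their own similarity metric and index, and `cosine` and `search` are hypothetical helper names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(sink, query_vector, k=2):
    """Return the k stored items most similar to query_vector."""
    scored = [(cosine(query_vector, item["vector"]), item) for item in sink]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:k]]


# Hypothetical stored items.
sink = [
    {"id": "a", "vector": [1.0, 0.0]},
    {"id": "b", "vector": [0.0, 1.0]},
    {"id": "c", "vector": [0.7, 0.7]},
]
results = search(sink, [1.0, 0.1], k=2)
print([item["id"] for item in results])  # ['a', 'c']
```

In practice the query text is first turned into a vector by the same embed connector used at ingestion, so that query and stored vectors live in the same space.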