Importing and managing datasets in Datawizz
Datawizz let’s you create datasets of LLM logs - either from requests recorded on DataWizz endpoints or from logs uploaded to the platform. You can use these datasets to train and evaluate logs.
Datasets can be helpful for creating consistent sets of data to train with, or to import existing data from other systems.
To create a dataset, navigate to the Datasets
tab in the Datawizz dashboard and click on the Create Dataset
button. You can choose a name for your new dataset and a description to help you remember what it’s for.
Once inside your dataset, you can click Import Logs
to import logs from Datawizz endpoints. You can use the filters there to narrow down the logs that’ll be imported.
You can also import logs from a CSV file. Click on Import CSV
and select a file to upload. The file must include input
and output
columns, where input
is the message(s) sent to the model and output
is the response.
Datawizz support two formats for the input
and output
columns:
content
field and a role
field. For example:content
field and a role
field. For example:Here are some ways to import logs from other systems. If you are using a system that is not listed here, you can still import logs by converting them to the CSV format described above - please share your use case with us so we can better document the process!
We have created a notebook that exports the logs from Humanloop to a Datawizz dataset CSV file. You can find the notebook here.
You can export logs from Langfuse to and import them into Datawizz. Datawizz supports the native export format from Langfuse making this simple.
See our video tutorial for a full walkthrough of the process.