Batch files upload
You can upload your own files, such as PDFs, TXT documents, and other text-based formats, to Bigdata.com to analyze them and extract valuable insights.
Once uploaded, your files are automatically indexed, making their content searchable and accessible through both the search and Chat endpoints.
The script below makes it easy to upload multiple files from a local directory using parallel threads, ideal for batch uploads.
If your browser displays the Python source instead of downloading it, press Ctrl+S after the file opens to save it.
Script parameters
workdir
: Absolute path to the work directory. For instance: /home/user/workdir_batch_01
upload_txt_filename
: Text file listing the absolute paths of the files to upload. This file must be inside the work directory above. For instance: file_list.txt
max_concurrency
: The number of concurrent threads used to upload files
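To illustrate how the three parameters fit together, here is a minimal sketch of a thread-based batch upload. It is not the downloadable script itself. `read_file_list` and `upload_all` are illustrative names, `upload_one` is a hypothetical stand-in for whatever call performs the actual upload and returns a file ID, and the sketch assumes the list file holds one path per line.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def read_file_list(workdir, upload_txt_filename):
    """Return the non-empty lines of the list file found inside workdir.

    Assumption: the list file contains one absolute path per line.
    """
    list_path = Path(workdir) / upload_txt_filename
    with open(list_path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def upload_all(paths, upload_one, max_concurrency):
    """Call upload_one(path) for every path on up to max_concurrency threads.

    upload_one is a hypothetical callable that returns a file ID or raises.
    Returns (file_id, upload_status, path) tuples in completion order.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        futures = {pool.submit(upload_one, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results.append((future.result(), "UPLOAD_DONE", path))
            except Exception:
                # Record the failure and keep going with the other files.
                results.append(("", "UPLOAD_ERROR", path))
    return results
```

A failure in one file is recorded rather than raised, so a single bad file does not stop the rest of the batch.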
How to run the script
- Follow the Prerequisites instructions to set up the required environment
- Place all the files that you want to upload in a directory, for instance /home/user/files_to_upload
- Create the work directory, for instance /home/user/workdir_batch_01
- In the work directory, create a txt file containing the absolute paths of all files to upload, for instance file_list.txt
- Finally, run the script
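The work directory and the list file from the steps above can also be prepared programmatically. The following is a sketch only: `write_file_list` is a hypothetical helper, and it assumes the list file holds one absolute path per line and that only the regular files directly inside the source directory should be listed.

```python
from pathlib import Path


def write_file_list(source_dir, workdir, list_name="file_list.txt"):
    """Create workdir if needed and write one absolute file path per line."""
    source = Path(source_dir).resolve()
    work = Path(workdir).resolve()
    work.mkdir(parents=True, exist_ok=True)
    # Subdirectories are skipped; only regular files are listed.
    paths = sorted(str(p) for p in source.iterdir() if p.is_file())
    list_path = work / list_name
    list_path.write_text("".join(f"{p}\n" for p in paths), encoding="utf-8")
    return list_path
```

For instance, `write_file_list("/home/user/files_to_upload", "/home/user/workdir_batch_01")` would write the list to /home/user/workdir_batch_01/file_list.txt.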
The script will generate two files:
- Logging file: Contains details about the upload process. For instance: bigdata_processing_20241026_002610.log
- CSV file with IDs: Lists the IDs of the uploaded files so you can manage them (delete, download, etc.) in the future. The CSV file contains the following values:
file_id
: File identifier that you can use in future requests to download or delete the uploaded file
upload_status
: Status of the upload. It can be UPLOAD_DONE or UPLOAD_ERROR
original_absolute_file_path
: The absolute path of the uploaded file
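The CSV file can be used to find uploads that failed, for example to build a new list file for a retry. The sketch below assumes the CSV starts with a header row whose column names match the three values described above; if the generated file has no header, the column names would need to be supplied to the reader instead. `failed_paths` is a hypothetical helper name.

```python
import csv


def failed_paths(csv_path):
    """Return the original paths of rows whose upload_status is UPLOAD_ERROR.

    Assumption: the first row is a header with the columns file_id,
    upload_status, and original_absolute_file_path.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [
            row["original_absolute_file_path"]
            for row in csv.DictReader(handle)
            if row["upload_status"] == "UPLOAD_ERROR"
        ]
```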
Example of the file uploaded_file_ids_20241026_002611.csv