Right now our scraper is downloading all the relevant staff reports for its search parameters and uploading those documents to our shared Google Drive.
The problem is that nothing checks whether a file already exists on the Google Drive before uploading it. If the scraper is run twice for the same date range, it will upload duplicates of all the staff reports, because Google Drive does not prohibit duplicate files or file names.
Our GoogleDrive_upload needs to be updated to check whether a particular filename already exists on the Google Drive in the specified city folder.
Tasks:
Write a new function, filename_check, in GoogleDrive_upload to check if a given filename exists in the current_city folder. The function should return True / False.
Add the filename_check function as a condition to the Legistar_Selenium scraper to prevent uploads of duplicate filenames.
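As a starting point, the two tasks above could be sketched roughly as follows. This is a minimal sketch, not our actual code. It assumes that GoogleDrive_upload talks to Drive through a Google Drive API v3 `service` object (as returned by `googleapiclient.discovery.build("drive", "v3", ...)`) and that the current_city folder is identified by its Drive folder ID. The names `build_query`, `folder_id`, `upload_if_new` and the `upload` callable are hypothetical and would need to be adapted to whatever the module really uses.

```python
def build_query(filename, folder_id):
    """Build a Drive v3 search query for an exact filename inside one folder."""
    # Drive query strings require backslashes and single quotes to be escaped.
    safe_name = filename.replace("\\", "\\\\").replace("'", "\\'")
    # Ignore trashed files so a deleted report can be uploaded again.
    return f"name = '{safe_name}' and '{folder_id}' in parents and trashed = false"


def filename_check(service, filename, folder_id):
    """Return True if a file named `filename` exists in the given folder.

    `service` is assumed to be a Drive API v3 client; `folder_id` is assumed
    to be the ID of the current_city folder.
    """
    query = build_query(filename, folder_id)
    page_token = None
    while True:
        response = (
            service.files()
            .list(
                q=query,
                spaces="drive",
                fields="nextPageToken, files(id, name)",
                pageToken=page_token,
            )
            .execute()
        )
        if response.get("files"):
            return True
        # Keep paging until the API reports there are no more results.
        page_token = response.get("nextPageToken")
        if page_token is None:
            return False


def upload_if_new(service, filename, folder_id, upload):
    """Call `upload(filename)` only when the name is not already in the folder.

    `upload` is a hypothetical stand-in for the existing upload routine.
    Returns True if the upload ran, False if it was skipped as a duplicate.
    """
    if filename_check(service, filename, folder_id):
        return False
    upload(filename)
    return True
```

In the Legistar_Selenium scraper, the upload call would then be wrapped in a condition like `upload_if_new`, so a second run over the same date range skips reports that are already in the city folder. If the city folders live in a shared drive rather than a shared folder, the `list` call would likely also need `supportsAllDrives=True` and `includeItemsFromAllDrives=True`.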