-
Thanks for posting. A couple thoughts:
3. Once you've done that, you should filter the recap-document API, not the docket entry API, and just loop through those items. I think if you do that, you'll fix your is_available=false problem.
The Web UI idea is a good one. I'd love a one-click button that just downloaded a whole case. Buuuuut, we haven't gotten there yet!
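The suggestion above, filtering the recap-document endpoint and looping over its items, could be sketched roughly as follows. This is only an illustration under assumptions: the API version in `API_ROOT`, the filter name `docket_entry__docket`, and the helper names are not taken from this thread, and the HTTP call is passed in as `get_json` so the sketch stays independent of any particular client.

```python
from typing import Any, Callable, Dict, Iterator
from urllib.parse import urlencode

# Assumed base URL and API version; adjust to the version you actually use.
API_ROOT = "https://www.courtlistener.com/api/rest/v4"


def recap_documents_url(docket_id: int, is_available: bool = False) -> str:
    """Build the first-page URL for one docket's RECAP documents.

    The filter name docket_entry__docket is an assumption; check the
    endpoint's list of filters before relying on it.
    """
    params = {
        "docket_entry__docket": docket_id,
        "is_available": "true" if is_available else "false",
    }
    return f"{API_ROOT}/recap-documents/?{urlencode(params)}"


def iter_results(
    first_url: str, get_json: Callable[[str], Dict[str, Any]]
) -> Iterator[Dict[str, Any]]:
    """Yield every item of a paginated listing by following 'next' links."""
    url = first_url
    while url:
        page = get_json(url)
        yield from page.get("results", [])
        url = page.get("next")
```

In practice `get_json` might wrap something like `requests.get(url, headers={"Authorization": f"Token {token}"}).json()`. Because each item is a single document rather than a docket entry, a docket entry with several unavailable documents should not show up more than once per document.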
-
I read through some of the underlying code, and as far as I can see, it's not actually possible to examine a DocketEntry with a single RecapDocument and tell whether the attachment page has ever been processed. See here: courtlistener/cl/recap/mergers.py Line 1331 in f242381 The processing of an attachment page with no attachments doesn't appear to do anything to the DocketEntry or to anything else in the database. Ordinarily (in the absence of errors and in the absence of anyone using But
I think the information could be fished out of Should I open an issue about this? It could be fixed (with some effort for back-populating the data) by adding a field to DocketEntry indicating whether (and from when?) an attachment page is present.
-
The plot thickens a bit. As far as I can tell from brief inspection, request_type==3 requests for documents that actually have attachments work. Requests for documents that don't have attachments fail with I think it's unfortunate that someone can, by malice or merely by accident (such as running the original version of my script), load data into RECAP that obscures the potential existence of attachments. But at least I did get request_type == working, and I'm running a little throttled job to backfill this data for the case that I've been querying.
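A throttled backfill like the one described above might look roughly like this. It is a sketch, not the actual script: it assumes request_type 3 is the attachment-page request (as this comment implies), the payload field name `recap_document` and all function names are hypothetical, and the network call (including PACER credentials) is left to a caller-supplied `submit` function that returns True on success.

```python
import time
from typing import Callable, Dict, Iterable, List

# Assumption: request_type 3 asks for an attachment page, as implied above.
ATTACHMENT_PAGE = 3


def build_fetch_payload(recap_document_id: int, request_type: int) -> Dict[str, int]:
    """Build the request body; the field names here are assumptions."""
    return {"request_type": request_type, "recap_document": recap_document_id}


def backfill_attachment_pages(
    doc_ids: Iterable[int],
    submit: Callable[[Dict[str, int]], bool],
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Submit one attachment-page request per document, pausing between them.

    Returns the IDs whose request failed, for example documents that turn
    out to have no attachments.
    """
    failed: List[int] = []
    for index, doc_id in enumerate(doc_ids):
        if index:  # no pause before the very first request
            sleep(delay_seconds)
        if not submit(build_fetch_payload(doc_id, ATTACHMENT_PAGE)):
            failed.append(doc_id)
    return failed
```

Collecting the failures instead of stopping at the first one keeps the job running through documents without attachments, and the returned list gives a record of which documents might be retried as plain document requests.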
-
Hi all-
I want to fetch all documents associated with a docket from PACER and contribute them to RECAP (and I'm happy to pay for it!). After getting some hints from @mlissner, I've gotten this far:
And I have a couple minor issues so far:
recap_documents__is_available==false works very poorly. If there's a document with two attachments, and all three of the docs in question are unavailable, I get three duplicate docket-entries. I assume I'm querying Django wrong, but I would have expected Django to give an error instead of messing it up MySQL-style. I worked around it.
And a major issue: I'm doing request-type 2 requests, which works fine for documents where the attachment list is already in RECAP and for documents that don't have attachments, but it does not work for documents that have attachments unknown to RECAP. And I don't know how to distinguish these cases. Any hints?
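For the duplicate docket entries mentioned above, one possible client-side workaround is to drop repeats by ID while looping over the results. This is only a sketch of that idea, not necessarily the workaround actually used; it assumes each result is a dict with an `id` key, and the function name is made up for illustration. Duplicates of this kind are typical when a filter spans a one-to-many relation and the query is not made distinct, so each matching document yields its own copy of the parent row.

```python
from typing import Any, Dict, Iterable, List


def dedupe_by_id(items: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first occurrence of each item ID, preserving order."""
    seen = set()
    unique: List[Dict[str, Any]] = []
    for item in items:
        if item["id"] in seen:
            continue
        seen.add(item["id"])
        unique.append(item)
    return unique
```

With three unavailable documents on one docket entry, the three identical entries returned by the API collapse to a single one.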
A request-type to ask for a document if it's a document and the attachment list if it's not (which is what I get if I just click 'Buy on PACER') would help. So would some way to distinguish what's going on from the docket-entry result.
Also, could CourtListener maybe add a web UI for this? Maybe with a hint that anyone who's about to spend a couple hundred dollars downloading a docket might also donate some money to CourtListener?