Citation Parsing
ryanfreckleton edited this page Mar 21, 2013 · 1 revision
- A New Approach towards Bibliographic Reference Identification, Parsing and Inline Citation Matching
- A structural SVM approach for reference parsing
- ParsCit: An open-source CRF reference string parsing package
- Stochastic Context-Free Grammars -- Haven't been applied to this problem yet (as far as I know).
- Conditional Random Fields -- This is the approach of ParsCit, FreeCite, etc. They tend to have about 80% accuracy.
- Crowd Sourcing Parsing -- Using a game or a Mechanical Turk-like system, have people parse and tag some citations.
- Programming Contests -- Once we have a large data set of inputs and outputs, we can ask programmers in the broader internet community to solve the problem.
- Parsley will occasionally get stuck in infinite loops when it parses a null string.
- Appropriate git branching is necessary to keep main clean and collaborate.
- Parsley grammars can call arbitrary Python code; this hasn't been fully taken advantage of yet.