ryanfreckleton edited this page Mar 21, 2013 · 1 revision

Previous Research

Possible Approaches

  • Stochastic Context-Free Grammars -- Haven't been applied to this problem yet (as far as I know).
  • Conditional Random Fields -- This is the approach of ParsCit, FreeCite, etc. They tend to have about 80% accuracy.
  • Crowd Sourcing Parsing -- Using a game or a Mechanical Turk-like system, have people parse and tag some citations.
  • Programming Contests -- Once we have a large data set of inputs and outputs, we can start asking programmers in the broader internet community to solve the problem.

Lessons Learned So Far

  • Parsley will occasionally get stuck in infinite loops when it parses a null string.
  • Appropriate git branching is necessary to keep main clean and to collaborate effectively.
  • Parsley grammars can call arbitrary Python code, but this hasn't been fully taken advantage of yet.
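
One possible way to sidestep the null-string problem noted above is to check the input before it ever reaches the grammar. The sketch below is only an illustration under assumptions: `parse_if_nonempty` and `toy_parse` are hypothetical names, and `toy_parse` is a stand-in for whatever grammar entry point the project actually uses. It does not call Parsley itself.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def parse_if_nonempty(parse: Callable[[str], T], text: Optional[str]) -> Optional[T]:
    """Call `parse` only when `text` contains something other than whitespace.

    Hypothetical helper: returns None for null or blank input so the
    underlying parser is never invoked on it.
    """
    if text is None or not text.strip():
        # Null or blank input never reaches the parser.
        return None
    return parse(text)


def toy_parse(citation: str) -> list:
    # Hypothetical stand-in for a real grammar entry point; it just
    # splits a citation on periods and drops empty pieces.
    return [part.strip() for part in citation.split(".") if part.strip()]
```

Calling `parse_if_nonempty(toy_parse, "")` returns `None` without invoking the parser, while a non-empty citation is passed through unchanged. Whether to return `None` or raise an error for blank input is a design choice left open here.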