Complex Noun Phrases and IPC Codes
by Gregory Grefenstette, Research Scientist, CEA List
Project
For each available patent, extract simple and complex noun phrases (of two or more words) and store them together with the patent's IPC codes and the heading of the section in which they occur (claims, description, etc.). These phrases would then be sorted and counted, and statistics on vocabulary specificity per IPC code would be calculated. The resulting IPC vocabulary lists might then be useful for other natural language processing tasks, such as automatic IPC classification or IPC-specific OCR correction.
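The pipeline described above can be illustrated with a minimal Python sketch. It is not the project's implementation, and it rests on several assumptions: a real system would use a part-of-speech tagger or chunker to find noun phrases, whereas this sketch approximates them as runs of two or more consecutive non-stopword tokens; the stopword list, the function names (`candidate_phrases`, `count_phrases`, `specificity`), the input format of `(ipc_codes, section, text)` tuples, and the choice of a relative-frequency ratio as the specificity statistic are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list used as a crude phrase delimiter.
# Assumption: a real system would rely on a POS tagger or chunker instead.
STOPWORDS = {
    "a", "an", "the", "of", "to", "in", "on", "for", "and", "or", "is",
    "are", "by", "with", "which", "that", "said", "wherein", "comprising",
    "having", "at", "from", "as", "be",
}


def candidate_phrases(text, min_words=2):
    """Yield runs of at least `min_words` consecutive non-stopword tokens."""
    for clause in re.split(r"[.,;:()]", text.lower()):
        run = []
        for token in re.findall(r"[a-z][a-z-]*", clause):
            if token in STOPWORDS:
                if len(run) >= min_words:
                    yield " ".join(run)
                run = []
            else:
                run.append(token)
        if len(run) >= min_words:
            yield " ".join(run)


def count_phrases(patents):
    """Count phrases per (IPC code, section heading).

    `patents` is an iterable of (ipc_codes, section, text) tuples.
    Returns a mapping: (ipc_code, section) -> Counter of phrases.
    """
    counts = defaultdict(Counter)
    for ipc_codes, section, text in patents:
        phrases = list(candidate_phrases(text))
        for ipc in ipc_codes:
            counts[(ipc, section)].update(phrases)
    return counts


def specificity(counts, ipc):
    """Relative frequency of each phrase under `ipc`, divided by its
    relative frequency over all IPC codes (values above 1 suggest the
    phrase is over-represented in that code)."""
    in_code, overall = Counter(), Counter()
    for (code, _section), phrase_counts in counts.items():
        overall.update(phrase_counts)
        if code == ipc:
            in_code.update(phrase_counts)
    n_code, n_all = sum(in_code.values()), sum(overall.values())
    return {
        phrase: (freq / n_code) / (overall[phrase] / n_all)
        for phrase, freq in in_code.items()
    }
```

Given a list such as `[(["G06F"], "claims", "A cache memory and a memory controller.")]`, `count_phrases` would store "cache memory" and "memory controller" under the key `("G06F", "claims")`. Note that in this sketch a patent carrying several IPC codes contributes its phrases once per code, which also inflates the overall counts; whether that is desirable is a design decision the project description leaves open.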
Project Partners
Contact
- Gregory Grefenstette (Gregory.Grefenstette_at_cea.fr)

