Using Python to extract information from the SEC website and analyze the data (preliminary code available)
$30-250 USD
Paid on delivery
I hired a freelancer to work on the task described below, but the job is incomplete. Please revise their code, download the data, and calculate the measures for me.
**************************************************************************************
To construct my sample of MD&As, I use a Python script to download all 10-Q filings from the Securities and Exchange Commission (SEC) Electronic Data-Gathering, Analysis, and Retrieval (EDGAR) system website. I then extract the MD&A section after removing all HTML tags from the filing. The script then searches the text of each MD&A for relevant terms and expressions to compute the disclosure measures described below. PROPFORWARD is the logarithm of one plus the proportion of words that indicate forward-looking information. The words indicating forward-looking information, identified in Li (2010), which I use to compute PROPFORWARD, are “will,” “should,” “can,” “could,” “may,” “might,” “expect,” “anticipate,” “believe,” “plan,” “hope,” “intend,” “seek,” “project,” “forecast,” “objective,” and “goal.” Following Li (2010), I exclude the words “expected,” “anticipated,” “forecasted,” “projected,” and “believed” when they follow “was,” “were,” “had,” or “had been,” because these constructions typically indicate a sentence that is not forward-looking. PROPNUMERIC is the logarithm of one plus the proportion of numeric terms in the text.
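To make the two measures concrete, here is a minimal Python sketch of how they might be computed once the MD&A text has already been extracted and stripped of HTML. Several details are assumptions on my part, not taken from Li (2010) or from the freelancer's code: the function names (count_terms, disclosure_measures) are hypothetical, the logarithm is taken to be the natural log, the denominator is the count of all word and number tokens, only the exact listed words plus the five past-tense forms named in the exclusion rule are counted, and a "numeric term" is any token that starts with a digit. Adjust these to match the intended definition.

```python
import math
import re

# Forward-looking words listed in the project description (Li, 2010).
FORWARD_WORDS = {
    "will", "should", "can", "could", "may", "might", "expect", "anticipate",
    "believe", "plan", "hope", "intend", "seek", "project", "forecast",
    "objective", "goal",
}
# Past-tense forms named in the exclusion rule. ASSUMPTION: they count as
# forward-looking unless they follow "was", "were", "had", or "had been".
PAST_FORMS = {"expected", "anticipated", "forecasted", "projected", "believed"}
PAST_MARKERS = {"was", "were", "had"}

# ASSUMPTION: a token is a run of letters, or a number such as 12, 1,200 or 3.5.
TOKEN_RE = re.compile(r"[A-Za-z]+|\d+(?:,\d{3})*(?:\.\d+)?")


def count_terms(text):
    """Return (forward_count, numeric_count, total_tokens) for plain text."""
    tokens = [t.lower() for t in TOKEN_RE.findall(text)]
    forward = numeric = 0
    for i, tok in enumerate(tokens):
        if tok[0].isdigit():
            numeric += 1
        elif tok in FORWARD_WORDS:
            forward += 1
        elif tok in PAST_FORMS:
            prev = tokens[i - 1] if i >= 1 else ""
            prev2 = tokens[i - 2] if i >= 2 else ""
            follows_past = prev in PAST_MARKERS or (prev == "been" and prev2 == "had")
            if not follows_past:
                forward += 1
    return forward, numeric, len(tokens)


def disclosure_measures(text):
    """Compute PROPFORWARD and PROPNUMERIC as log(1 + proportion)."""
    forward, numeric, total = count_terms(text)
    if total == 0:
        # Empty text: both proportions are treated as zero.
        return {"PROPFORWARD": 0.0, "PROPNUMERIC": 0.0}
    return {
        "PROPFORWARD": math.log1p(forward / total),
        "PROPNUMERIC": math.log1p(numeric / total),
    }


if __name__ == "__main__":
    sample = "We expect sales of 1,200 units. Revenue was expected to grow."
    print(disclosure_measures(sample))
```

In the sample sentence, "expect" is counted but "was expected" is not, which is the behavior the exclusion rule asks for. Downloading the 10-Q filings from EDGAR and isolating the MD&A section are separate steps that this sketch does not cover.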
Project ID: #13191305