Software firms dataset about diversification and interdependence

doi: 10.4121/7349e277-d28c-48e6-953b-93e61654ef00.v1
The doi above is for this specific version of this dataset, which is currently the latest. Newer versions may be published in the future. For a link that will always point to the latest version, please use
doi: 10.4121/7349e277-d28c-48e6-953b-93e61654ef00
Datacite citation style:
Vlas, Cristina (2023): Software firms dataset about diversification and interdependence. Version 1. 4TU.ResearchData. dataset.
License: CC0

We start by identifying U.S.-based software organizations in the computer programming and data processing industry (SIC 737), a knowledge-intensive, high-growth setting. We integrate two main data sources. First, to collect the knowledge-based measures, we use publicly available data provided by the U.S. Patent and Trademark Office (USPTO). Using the General Architecture for Text Engineering (GATE) software, we design queries that retrieve the complete class and subclass information for each patent, as well as citations, inventors, and total patents granted between 1998 and 2011 inclusive. We aggregate the data into organization-year observations at the class and subclass levels and use these aggregated measures to compute the knowledge-based predictors and covariates. To compute moving averages for some variables, we collect five additional years of USPTO data, so the knowledge dataset spans 1993 to 2011. Second, we use Compustat to collect organization-level control variables such as assets, number of employees, market valuation, R&D expenditures, intangibles, solvency, and slack. Integrating the two datasets yields a final panel of 100 organizations, with an average of 3.2 yearly observations per organization between 1998 and 2011.
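
The aggregation and moving-average steps described above can be illustrated with a minimal sketch. It is not the author's code: the records, organization names, and function names are hypothetical, and the window definition (the mean over the five years preceding the focal year, with missing years counted as zero) is an assumption, since the description does not specify how the moving averages are computed.

```python
from collections import defaultdict

# Hypothetical patent records: (organization, grant year, patent class).
# These values are invented for illustration and are not from the dataset.
patents = [
    ("OrgA", 1993, "707"),
    ("OrgA", 1994, "707"),
    ("OrgA", 1994, "709"),
    ("OrgA", 1996, "345"),
    ("OrgA", 1997, "707"),
    ("OrgA", 1998, "709"),
    ("OrgB", 1997, "380"),
]


def count_by_org_year(records):
    """Aggregate patent records into counts per (organization, year)."""
    counts = defaultdict(int)
    for org, year, _patent_class in records:
        counts[(org, year)] += 1
    return dict(counts)


def trailing_average(counts, org, year, window=5):
    """Mean yearly count over the `window` years before `year`.

    Assumption: the focal year is excluded and years without
    patents count as zero.
    """
    preceding_years = range(year - window, year)
    return sum(counts.get((org, y), 0) for y in preceding_years) / window


counts = count_by_org_year(patents)
avg_1998 = trailing_average(counts, "OrgA", 1998)
print(counts[("OrgA", 1994)], avg_1998)
```

Under this assumed window, the first sample year (1998) draws on 1993 through 1997, which is consistent with the five additional years of USPTO data mentioned in the description.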

  • 2023-09-13 first online, published, posted
University of Massachusetts, Isenberg School of Management

