TDSM 1.3
Data.gov is a collection of large datasets from various fields. Some interesting datasets are:
- College Scorecard [1]. This dataset collects key information about all the universities and colleges in the United States, including:
  - The latitude and longitude of the institution.
  - The male/female ratio at the institution.
  - The average salary of undergraduates and graduates after graduation.
  - The average SAT and GRE scores of admitted candidates.
  This information can be used to explore many interesting questions, such as:
  - To what extent does the SAT/GRE score determine the salary a student is offered after graduation?
  - What factors determine a student's salary after graduation? Is it the location of the institution, the average SAT/GRE score, or the faculty-student ratio?
  - Does the presence of a university in a region affect that region's economy?
- American FactFinder Dataset [2]. This dataset is a collection of population, economic, geographic, and housing information for several regions in the United States. Interesting questions that can be answered with this dataset are:
  - Does the economy of a region depend on its racial makeup?
  - Which region contributes most to the economy of the United States, and what factors does this contribution depend on?
  - How does the average family income depend on the demographic distribution of a region?
- Crime data [3]. This dataset lists all the crimes that occurred in the city of Chicago from 2001 to the present. Many interesting insights can be generated by combining this data with demographic data for Chicago. Some questions worth exploring are:
  - How does crime affect the economy of an area?
  - What causes crime to increase in a certain locality? Education? Average family income?
  - Which factors are negatively related to crime, and would crime decrease if more emphasis were placed on them?
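As a rough illustration of how crime records might be combined with demographic data, the sketch below counts crimes per area, joins the counts with a per-area income table, and computes a Pearson correlation. The area names, records, income figures, and function names are all made up for illustration; they are not taken from the actual datasets, whose schemas would need to be checked first. A correlation found this way would only suggest a relationship, not establish a cause.

```python
from collections import Counter
from math import sqrt

# Hypothetical crime records: (area, crime_type). Not real data.
crimes = [
    ("A", "THEFT"), ("A", "BATTERY"), ("A", "THEFT"), ("A", "ASSAULT"),
    ("B", "THEFT"), ("B", "BATTERY"),
    ("C", "THEFT"),
    ("D", "THEFT"), ("D", "BATTERY"), ("D", "THEFT"),
]

# Hypothetical demographic table: area -> average family income. Not real data.
income = {"A": 30000, "B": 50000, "C": 70000, "D": 40000}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def crime_income_correlation(crimes, income):
    """Join per-area crime counts with income and correlate the two."""
    counts = Counter(area for area, _ in crimes)
    areas = sorted(income)  # areas with no recorded crime get a count of 0
    incomes = [income[a] for a in areas]
    crime_counts = [counts[a] for a in areas]
    return pearson(incomes, crime_counts)


r = crime_income_correlation(crimes, income)
print(f"correlation between income and crime count: {r:.3f}")
```

The same pattern (aggregate one dataset by region, join on the region key, then correlate) could be applied to the SAT/GRE versus salary question for the College Scorecard data.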