English Proficiency Data

A summary of the Census English Proficiency Data specification.

1.0 Introduction

English Proficiency datasets are reported once per year by the United States Census. The data reflect persons' ability to speak English. The datasets cover 2015 to 2020 and have been aggregated to census tracts.

The US Census reports English speaking ability in the context of the many languages spoken within the United States. The data count the English-speaking population, broken down by ability level.

English Proficiency Categories

Across the United States, people speak many languages other than English. Collecting data on how well people speak English provides insight into the diversity of the United States.

Categories used to report English speaking ability:

  • Very Well

  • Well

  • Not Well

  • Not at All

  • Native Speaker

Census English Proficiency Parameters: Census.gov

2.0 Data Specification

Downloaded English Proficiency data are provided in the format shown in the following table:

| Field             | Type   | Description                                                              |
|-------------------|--------|--------------------------------------------------------------------------|
| geo_id            | String | 14-digit code relating the data to the correct geolocation.              |
| census_tract_code | String | The unique identifier provided by the US Census for each census tract.   |
| sub_category      | String | Level of English speaking.                                               |
| unit              | String | Total count of persons.                                                  |
| value             | Float  | Number of persons fitting within survey criteria.                        |
| year              | String | Year data was collected.                                                 |
| county_name       | String | Name of the county that the census tract resides in.                     |
| state_name        | String | Name of the state that the census tract resides in.                      |
| geometry          | String | The GIS information required for the computer to read the mapping file.  |
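
As a rough illustration of the field types listed above, the following Python sketch models one row as a typed record. The class name, the helper function, and every sample value are hypothetical and chosen only for illustration; they are not real Census figures, and the actual download format (CSV, GeoJSON, or other) is not stated here.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EnglishProficiencyRecord:
    """One row of the dataset, typed as in the specification table."""
    geo_id: str
    census_tract_code: str
    sub_category: str
    unit: str
    value: float
    year: str
    county_name: str
    state_name: str
    geometry: str


def parse_record(row: dict) -> EnglishProficiencyRecord:
    """Build a record from raw values, keeping strings and coercing `value` to float."""
    values = {f.name: str(row[f.name]) for f in fields(EnglishProficiencyRecord)}
    values["value"] = float(row["value"])  # the only non-string field in the table
    return EnglishProficiencyRecord(**values)


# Hypothetical sample row: all values are placeholders, not real data.
sample_row = {
    "geo_id": "0500000US48201",            # GEO.ID quoted elsewhere in this document
    "census_tract_code": "000000",         # placeholder tract code
    "sub_category": "Very Well",
    "unit": "Total count of persons",      # assumed content of the unit field
    "value": "123",
    "year": "2020",
    "county_name": "Harris County",
    "state_name": "Texas",
    "geometry": "PLACEHOLDER_GEOMETRY",    # the geometry encoding is not specified here
}

record = parse_record(sample_row)
print(record.sub_category, record.value)
```

Keeping `year` and the identifier fields as strings follows the table and preserves any leading zeros in the codes.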

💡 Tip: The “GEO.ID” field contains 14-digit codes that identify the summary level of the data, the geographic component of the data, and the FIPS codes that uniquely identify the data. For example, the 14-digit “GEO.ID” for Harris County, TX is “0500000US48201”, where “050” represents the summary level of the data, “0000” represents the 2-digit geographic variant and the 2-digit geographic component, “US” represents the United States, “48” represents the state of Texas, and “201” represents Harris County.
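
The breakdown in the tip can be sketched in Python as simple string slicing. This is a minimal sketch that assumes the 14-character, county-level layout of the Harris County example; the function name and dictionary keys are hypothetical, and codes at other summary levels may have a different length and would need different slices.

```python
def parse_county_geo_id(geo_id: str) -> dict:
    """Split a 14-character county-level GEO.ID into its documented parts."""
    if len(geo_id) != 14 or geo_id[7:9] != "US":
        raise ValueError(f"Unexpected GEO.ID layout: {geo_id!r}")
    return {
        "summary_level": geo_id[0:3],         # "050" in the example
        "geographic_variant": geo_id[3:5],    # first half of "0000"
        "geographic_component": geo_id[5:7],  # second half of "0000"
        "country": geo_id[7:9],               # "US"
        "state_fips": geo_id[9:11],           # "48" for Texas
        "county_fips": geo_id[11:14],         # "201" for Harris County
    }


parts = parse_county_geo_id("0500000US48201")
print(parts["state_fips"], parts["county_fips"])
```

The parts are kept as strings rather than integers so that leading zeros in FIPS codes are not lost.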