떼닝로그
Data Science Methodology - From Problem to Approach and From Requirements to Collection (2)
떼닝 · 2023. 12. 27. 07:43 · Data Science Methodology
From Requirements to Collection
Data Requirements
From Requirements to Collection
- Data Requirements : What are data requirements?
- Data Collection : What occurs during data collection?
Case Study : Selecting the cohort
Define and select the cohort (cohort: a group sharing defined characteristics):
- Inpatient within the health insurance provider's service area
- Primary diagnosis of CHF (Congestive Heart Failure) within one year
- Continuous enrollment for at least 6 months prior to the primary CHF admission
- No disqualifying conditions (patients with disqualifying conditions are excluded)
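The four criteria above can be read as a single filter over admission records. The following is a minimal sketch, not the case study's actual implementation: the `Admission` fields, the 182-day approximation of "6 months", and the sample rows are all assumptions made for illustration, and continuous enrollment is simplified to a single `enrolled_since` date.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Admission:
    """Hypothetical record layout; every field name is an illustrative assumption."""
    patient_id: str
    in_service_area: bool
    is_inpatient: bool
    primary_diagnosis: str
    admitted_on: date
    enrolled_since: date
    has_disqualifying_condition: bool


def in_cohort(adm, window_start, window_end, min_enrollment_days=182):
    """Return True if an admission meets all four cohort criteria."""
    # "At least 6 months" is approximated as 182 days (an assumption).
    enrolled_long_enough = (
        adm.admitted_on - adm.enrolled_since
    ) >= timedelta(days=min_enrollment_days)
    return (
        adm.is_inpatient
        and adm.in_service_area
        and adm.primary_diagnosis == "CHF"
        and window_start <= adm.admitted_on <= window_end
        and enrolled_long_enough
        and not adm.has_disqualifying_condition
    )


# Made-up sample rows, one per patient.
admissions = [
    Admission("p1", True, True, "CHF", date(2023, 6, 1), date(2022, 1, 1), False),
    # Enrolled for only about three months before admission.
    Admission("p2", True, True, "CHF", date(2023, 6, 1), date(2023, 3, 1), False),
    # Has a disqualifying condition.
    Admission("p3", True, True, "CHF", date(2023, 6, 1), date(2022, 1, 1), True),
]

cohort = [
    a.patient_id
    for a in admissions
    if in_cohort(a, date(2023, 1, 1), date(2023, 12, 31))
]
print(cohort)
```

Only `p1` passes every check here; `p2` fails the enrollment rule and `p3` is excluded by the disqualifying condition.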
Case Study : Defining the data
Content, formats, and representations suitable for a decision tree classifier:
- One record per patient, with columns representing the variables (the dependent variable and the predictors)
- Content covering all aspects of each patient's clinical history (the source data is in transactional format, so transformations are required)
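To make "transactional format, transformations required" concrete, here is a small sketch of collapsing event-level rows into one record per patient. The event names, the `readmitted` label used as the dependent variable, and the count-based predictors are hypothetical choices for illustration, not details taken from the course.

```python
from collections import defaultdict

# Hypothetical transactional rows: one row per clinical event, not per patient.
transactions = [
    {"patient_id": "p1", "event": "admission", "diagnosis": "CHF"},
    {"patient_id": "p1", "event": "admission", "diagnosis": "CHF"},
    {"patient_id": "p1", "event": "er_visit", "diagnosis": "diabetes"},
    {"patient_id": "p2", "event": "admission", "diagnosis": "CHF"},
]

# Hypothetical outcome labels (the dependent variable), keyed by patient.
readmitted = {"p1": True, "p2": False}


def to_patient_records(rows, labels):
    """Collapse event rows into one record per patient with count predictors."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        counts[row["patient_id"]][row["event"]] += 1

    records = []
    for pid in sorted(counts):
        records.append({
            "patient_id": pid,
            "n_admissions": counts[pid]["admission"],  # predictor
            "n_er_visits": counts[pid]["er_visit"],    # predictor
            "readmitted": labels[pid],                 # dependent variable
        })
    return records


records = to_patient_records(transactions, readmitted)
for record in records:
    print(record)
```

Each output dictionary is one row of the table a classifier would train on: predictors as columns, plus the label.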
Data Collection
Case Study : Gathering available data
Available data sources:
- corporate data warehouse (single source of medical & claims, eligibility, provider, and member information)
- inpatient record system
- claim payment system
- disease management program information
Case Study : Deferring inaccessible data
Data wanted but not available:
- Pharmaceutical records
- The team decided to defer collecting this data (defer: to postpone)
Case Study : Merging data
- Merging data from the different sources makes it possible to eliminate redundant data
- At this stage, the team can also discuss ways to better manage their data
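As a rough illustration of merging sources while dropping redundant copies, the sketch below combines rows from several sources into one record per member. The `member_id` key, the field names, and the sample rows are assumptions; a real warehouse merge would also need to handle conflicting values, which this sketch does not.

```python
def merge_sources(*sources):
    """Merge rows from several sources into one record per member_id.

    A field that is already present is kept, so a redundant copy coming
    from a later source is simply dropped.
    """
    merged = {}
    for source in sources:
        for row in source:
            record = merged.setdefault(row["member_id"], {})
            for key, value in row.items():
                record.setdefault(key, value)
    return merged


# Made-up rows standing in for three of the sources listed earlier.
warehouse = [{"member_id": "m1", "name": "Member One", "plan": "gold"}]
inpatient = [{"member_id": "m1", "name": "Member One", "admissions": 2}]
claims = [
    {"member_id": "m1", "paid_claims": 5},
    {"member_id": "m2", "paid_claims": 1},
]

merged = merge_sources(warehouse, inpatient, claims)
for member_id, record in merged.items():
    print(member_id, record)
```

The duplicated `name` field for `m1` appears only once in the merged record, while fields unique to each source are preserved.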
Practice Quiz : From Requirements to Collection
Q. Select the statement that describes what happens during the Data Requirements stage.
A. Data scientists identify the necessary data content, formats, and sources for initial data collection.
Q. Who determines how to collect and prepare the data?
A. Data Scientists
Q. Which of the following statements is correct?
A. The following statements are correct:
- Data scientists determine how to collect the data.
- Data scientists identify the data that is required for data modeling.
- Data scientists determine how to prepare the data.