MapReduce Algorithm Design part2

춤추는수달 2021. 11. 11. 23:26

Serialization

Writables: Hadoop's serialization format

Cons: not easy to extend, Java only / Pros: compact, fast
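The trade-off can be illustrated with a plain-Python sketch. This is not Hadoop's Java API: `IntPairWritable` and its `write` / `read_fields` methods are hypothetical names that only mimic the write-then-read-in-the-same-order pattern Writables follow. The encoding carries no field names or type tags, which is why it is compact and fast, and also why adding a field later breaks older readers.

```python
import io
import struct


class IntPairWritable:
    """Hypothetical pair of ints mimicking the Writable write/read pattern."""

    def __init__(self, left=0, right=0):
        self.left = left
        self.right = right

    def write(self, out):
        # Two big-endian 32-bit ints: exactly 8 bytes, no names or type tags.
        out.write(struct.pack(">ii", self.left, self.right))

    def read_fields(self, inp):
        # Fields must be read back in the same order they were written.
        self.left, self.right = struct.unpack(">ii", inp.read(8))


buf = io.BytesIO()
IntPairWritable(3, 42).write(buf)
encoded = buf.getvalue()

decoded = IntPairWritable()
decoded.read_fields(io.BytesIO(encoded))
print(len(encoded), decoded.left, decoded.right)  # 8 3 42
```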

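InputFormat

TextInputFormat: the default. The key is the byte offset of the line, the value is the line's contents

KeyValueTextInputFormat: splits each line at the tab character to separate key and value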
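A minimal sketch of the records each format hands to the mapper, written as plain Python rather than Hadoop code. The function names `text_input_records` and `key_value_records` are made up for illustration; the behavior they imitate (byte offset as key, split at the first tab) is what the two formats are documented to do.

```python
def text_input_records(data: bytes):
    """Mimic TextInputFormat: key = byte offset of the line, value = line text."""
    offset = 0
    for raw in data.splitlines(keepends=True):
        yield offset, raw.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw)


def key_value_records(data: bytes, sep: str = "\t"):
    """Mimic KeyValueTextInputFormat: split each line at the first separator."""
    for raw in data.splitlines():
        key, _, value = raw.decode("utf-8").partition(sep)
        # With no separator, the whole line becomes the key and the value is empty.
        yield key, value


data = b"apple\t3\nbanana\t5\n"
print(list(text_input_records(data)))  # [(0, 'apple\t3'), (8, 'banana\t5')]
print(list(key_value_records(data)))   # [('apple', '3'), ('banana', '5')]
```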

OutputFormat

Hadoop job = a Hadoop MapReduce program

task attempt = a running instance of a task

workflow

job submission

RecordReader, Partitioner, RecordWriter
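Of these three, the Partitioner is the easiest to sketch: it decides which reducer receives each intermediate key. Hadoop's default HashPartitioner computes `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`. The Python below is only an imitation of that idea; `hash_partition` is a hypothetical name and `zlib.crc32` is a stand-in for Java's `hashCode()`.

```python
import zlib


def hash_partition(key: str, num_reducers: int) -> int:
    """Map a key to a reducer index in the range [0, num_reducers)."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


pairs = [("a", 1), ("b", 1), ("a", 1), ("c", 1)]
buckets = {}
for key, value in pairs:
    # Every pair with the same key lands in the same bucket (reducer).
    buckets.setdefault(hash_partition(key, 3), []).append((key, value))
```

Because the index depends only on the key, all values for one key end up at a single reducer, which is what makes the reduce step possible.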

Configuration parameters -> passed via the job configuration object (important)

distributed cache -> a local copy exists on every data node

context object: interacts with the Hadoop system (reporting progress, fetching job configuration values, etc.)
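A rough sketch of how a mapper might use the context, in plain Python. Everything here is hypothetical: `FakeContext`, its methods, and the parameter name `wordfilter.min.length` are invented for illustration and are not Hadoop's API. The point is only that the mapper reads parameters from the job configuration and reports back through the same object.

```python
class FakeContext:
    """Hypothetical stand-in for the context object handed to a mapper."""

    def __init__(self, conf):
        self.conf = dict(conf)   # job configuration values set by the driver
        self.output = []         # emitted (key, value) pairs
        self.counters = {}       # status counters reported by the task

    def get(self, name, default=None):
        return self.conf.get(name, default)

    def write(self, key, value):
        self.output.append((key, value))

    def increment(self, counter, amount=1):
        self.counters[counter] = self.counters.get(counter, 0) + amount


def length_filter_map(line, context):
    # Read a parameter from the job configuration instead of hard-coding it.
    min_len = int(context.get("wordfilter.min.length", "1"))
    for word in line.split():
        if len(word) >= min_len:
            context.write(word, 1)
        else:
            context.increment("skipped_words")


ctx = FakeContext({"wordfilter.min.length": "4"})
length_filter_map("the quick brown fox", ctx)
print(ctx.output)    # [('quick', 1), ('brown', 1)]
print(ctx.counters)  # {'skipped_words': 2}
```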

Hadoop Runtime System : Scheduling, data distribution, synchronization, errors and faults

Hadoop2

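local aggregation: reduces intermediate data. The difference from a combiner is that the developer controls it directly. It keeps state

Mapper histogram: a word that appears two or more times in one line is combined into a single count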
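The per-line histogram can be sketched in a few lines of plain Python. `histogram_map` is a hypothetical name; a real Hadoop mapper would emit through its context rather than return a list.

```python
from collections import Counter


def histogram_map(line: str):
    """Emit one (word, count) pair per distinct word in the line."""
    # Count within the line first, so "a a a" yields ('a', 3), not three ('a', 1).
    counts = Counter(line.split())
    return list(counts.items())


print(histogram_map("a b a c a"))  # [('a', 3), ('b', 1), ('c', 1)]
```

For this line the mapper emits 3 pairs instead of 5, and no state has to survive beyond a single map call.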

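preserving state: each unique word is emitted only once for the entire input. State is kept across map calls and emitted when the mapper finishes

in-mapper combining: the developer does the combining directly inside the mapper, using the mapper's own storage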
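Both ideas fit in one plain-Python sketch. The class `InMapperCombiningMapper` and the driver `run_task` are hypothetical; the `setup` / `map` / `cleanup` names only mirror the lifecycle of a Hadoop mapper. Nothing is emitted from `map`; the accumulated state is emitted once in `cleanup`.

```python
from collections import Counter


class InMapperCombiningMapper:
    """Hypothetical mapper that keeps counts in memory across map() calls."""

    def setup(self):
        # State lives for the whole task, not for a single input record.
        self.counts = Counter()

    def map(self, line):
        # Accumulate only; nothing is emitted per record.
        self.counts.update(line.split())

    def cleanup(self):
        # Emit each unique word exactly once, with its total count.
        return sorted(self.counts.items())


def run_task(lines):
    """Drive one mapper over its input split, as the framework would."""
    mapper = InMapperCombiningMapper()
    mapper.setup()
    for line in lines:
        mapper.map(line)
    return mapper.cleanup()


result = run_task(["a b a", "b c", "a"])
print(result)  # [('a', 3), ('b', 2), ('c', 1)]
```

The cost of this control is memory: the whole table must fit in the mapper's memory, so a real implementation would likely need to flush it when it grows too large.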
