Posts in the 'All categories' category (page 7)

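Dangling indices problem (an index that will not go away)

When an index breaks and you recreate it under the same name (starting from the mapping), the old index keeps coming back.

The index keeps showing up even after it is deleted. The issue is reported as fixed upstream, but it apparently persists in some versions.

It looks like recovery in the local gateway keeps re-importing the dangling index.

Related URL: https://github.com/elastic/elasticsearch/issues/2067

1) Delete the index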

$ curl -XDELETE 'http://localhost:9200/indexname/'

Workaround

In elasticsearch.yml, set

gateway.local.auto_import_dangled: no

then restart ES and create the index again.


MR(mapreduce) with java

This post implements MapReduce, the most basic (and most manual) way of extracting data from Hadoop, in Java.

Basic Hadoop installation and configuration are omitted here.

Map and reduce structure

Basic Hadoop commands

fs commands for file I/O: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html
ex) hadoop fs -mkdir '/tmp/temp'

Running a job: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CommandsManual.html
ex) hadoop jar temp.jar

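Job structure

When the client (the machine that launches the MR job) starts a job, it contacts the namenode of the cluster, taken either from the Hadoop configuration files named in the source or from settings declared directly in code, asks it for the file list and file locations, and obtains access permission.

Map tasks then start on the data nodes. This work is implemented by overriding the map function of the Mapper class. In this step the input is read row by row and each line of text is turned into a key-value pair. For example, when counting visits per id, each data node reads its own raw data line by line and emits simple pairs of id and value such as ['laesunk', 1] ['laesunk', 1]. When a map task finishes, its output is written to local files, and the reducers fetch it over HTTP. In between, shuffling, combining and similar steps leave room for tuning.

The reducer merges the mapper outputs that share a key. Data coming from several mappers will contain the same key, so merging them yields the form (key, iterable). The reduce function then runs once per input pair, and because the values arrive as an iterable, further processing can be added at that point.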
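The flow described above can be sketched without a cluster. The following is a minimal in-memory illustration, not Hadoop code: the class name MapReduceSketch and its methods are hypothetical, each input line is assumed to hold just an id, and the "shuffle" is a plain sorted map rather than a transfer over HTTP.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch of map -> shuffle -> reduce (no Hadoop involved).
final class MapReduceSketch {

    // Map step: turn one raw input line (assumed to hold just an id) into an (id, 1) pair.
    static Map.Entry<String, Integer> map(String line) {
        return new SimpleEntry<>(line.trim(), 1);
    }

    // Shuffle step: group the values of all map outputs by key, sorted by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: called once per key with an iterable of that key's values.
    static int reduce(String key, Iterable<Integer> values) {
        int count = 0;
        for (int value : values) {
            count += value;
        }
        return count;
    }

    // Runs the three steps in sequence over the given lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.add(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(mapped).entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {guest=1, laesunk=2}
        System.out.println(run(Arrays.asList("laesunk", "laesunk", "guest")));
    }
}
```

In a real job the map and reduce bodies live in the Mapper and Reducer subclasses, and the grouping is done by the framework between the two phases.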

JOBmain

import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class testjob{
	protected static Job job;
	protected static JobConf conf;
	
	private static String HDFS_PREFIX = "hadoophost";
	private static String MAP_TASKS_CONF_KEY = "mapred.map.tasks";
	private static String REDUCE_TASKS_CONF_KEY = "mapred.reduce.tasks";


	public static void main(String[] args) throws IOException {
		conf = new JobConf();
		// Register the Hadoop configuration files. The cluster environment (data nodes, buffer sizes, etc.)
		// is read from here; individual options can also be set in code.
		conf.addResource(new Path("mapred-site.xml"));
		conf.addResource(new Path("core-site.xml"));
		conf.addResource(new Path("hdfs-site.xml"));

		conf.set(MAP_TASKS_CONF_KEY, "12");
		conf.set(REDUCE_TASKS_CONF_KEY, "3");

		conf.setJar(args[0]);  // the jar package file is passed as the first argument

		job = new Job(conf);
		job.setJobName("jobname");
		job.setInputFormatClass(TextInputFormat.class);
		job.setOutputFormatClass(TextOutputFormat.class);
		job.setMapperClass(testMapper.class);            // the mapper class defined in the source below
		job.setReducerClass(testReducer.class);
		job.setOutputKeyClass(Text.class);               // key type of the output (also used for the map output here)
		job.setOutputValueClass(IntWritable.class);      // value type of the output (also used for the map output here)

		try{
			FileInputFormat.addInputPath(job, new Path("input path"));      // location of the input data in Hadoop
			FileOutputFormat.setOutputPath(job, new Path("outputpath"));    // location of the job's result files, also in Hadoop
		}catch(IOException e){
			e.printStackTrace();
		}

		try {
			job.waitForCompletion(true);
			Thread.sleep(10000);
		} catch (ClassNotFoundException e){
			// TODO Auto-generated catch block
			e.printStackTrace();
		} catch (InterruptedException e){
			// TODO Auto-generated catch block
			e.printStackTrace();
		}
	}
}

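Hadoop does not use types such as int and String directly for keys and values. It uses its own types such as IntWritable and Text, which implement the Writable interface. Writable is Hadoop's own serialization mechanism (write(DataOutput) and readFields(DataInput)) and is used instead of java.io.Serializable.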

mapper source

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Input: (byte offset, line of text); output: (name, count).
public class testMapper extends Mapper<LongWritable, Text, Text, IntWritable>{
	private Text keyVar = new Text();
	private IntWritable valueVar = new IntWritable();
	// The delimiter literal is garbled in the original post; a tab is assumed here.
	private String kvDelimeter = "\t";

	@Override
	protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException{
		String data = value.toString();
		try{
			String[] datas = data.split(kvDelimeter);
			String name = datas[0];
			String cnt = datas[1];

			keyVar.set(name);
			valueVar.set(Integer.parseInt(cnt));
			context.write(keyVar, valueVar);

		}catch(RuntimeException e){
			// Malformed line (missing field or non-numeric count): log it and move on.
			e.printStackTrace();
		}
	}
}

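The data emitted by the map step is sorted by key and fed to the reducer. The code below receives each key together with its list of values and computes the sum of the values for that key.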

reducer source

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class testReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
	private Text keyVar = new Text();
	private IntWritable valueVar = new IntWritable();

	@Override
	protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
		// Sum all counts that arrived for this key.
		int count = 0;
		for (IntWritable value : values) {
			count += value.get();
		}

		keyVar.set(key);
		valueVar.set(count);
		context.write(keyVar, valueVar);
	}
}


Dumping MR job log files

http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#logs

Dump the container logs

  Usage: yarn logs <options>

COMMAND_OPTIONS	Description
-applicationId ApplicationId	Specify an application id
-appOwner AppOwner	Specify an application owner
-containerId ContainerId	Specify a container id
-nodeAddress NodeAddress	Specify a node address

example) yarn logs -applicationId job1238547128312


Lucene(elasticsearch) score calculation

Reference: http://lucene.apache.org/core/2_9_4/api/core/org/apache/lucene/search/Similarity.html

Scoring factors

 - Term Frequency (tf): how many times the queried keyword occurs in the document. The square root of the count is used.
        tf(t in d) = √frequency
 - Inverse Document Frequency (idf): measures how common the term is across all documents. A term that appears in many documents is considered unimportant and gets a low weight.
        idf(t) = 1 + log(numDocs / (docFreq + 1))
 - Field length norm: weights a term by the length of the field it appears in. The fewer terms the field contains, the larger the weight. The inverse square root of the number of terms is used.
        ex) query string: 로이조. "로이조 주말 핫녹" (high) vs. "로이조 주말 핫 녹화 방송" (low), relative to each other
        norm(d) = 1 / √numTerms
 - queryNorm: normalizes a query made of two or more terms. A two-term query is executed as query:{ bool:{ should:[ {term:{field:'aaa'}}, {term:{field:'bbb'}} ] } }. In that process the query is split into two term queries, and queryNorm is used to capture the relative difference (relationship) between the two words. With a single term it is meaningless.
        queryNorm(q) = 1 / √sumOfSquaredWeights
        sumOfSquaredWeights = Σ(t in q) (idf(t) * t.getBoost())^2
 - coord: rewards documents that hit more of the query terms. It reflects how many of the query terms the matched document contains. (For a 3-word query where 2 words match, the score for the matched words is multiplied by 2/3.)
 - t.getBoost(): a boost that can be set at query time to search with a higher weight. It enters the queryNorm calculation.

Order and logic of the calculation

 1) For each term, compute fieldWeight (tf * idf * fieldNorm).
 2) For each term, compute queryNorm.
 3) Multiply the two for each term and sum the results.
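The formulas and the three steps above can be tried out with a small sketch. This is only an illustration of the simplified order described in this post, not Lucene's actual scoring code: the class ScoreSketch, its method names and all numbers in main are hypothetical, coord is left out, and log is taken as the natural logarithm.

```java
// Hypothetical sketch of the factors listed above (not Lucene's implementation).
final class ScoreSketch {

    // tf(t in d) = sqrt(frequency)
    static double tf(int frequency) {
        return Math.sqrt(frequency);
    }

    // idf(t) = 1 + log(numDocs / (docFreq + 1)), natural logarithm assumed
    static double idf(int numDocs, int docFreq) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // norm(d) = 1 / sqrt(numTerms)
    static double fieldNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // queryNorm(q) = 1 / sqrt(sum over terms of (idf * boost)^2)
    static double queryNorm(double[] idfs, double[] boosts) {
        double sumOfSquaredWeights = 0.0;
        for (int i = 0; i < idfs.length; i++) {
            double weight = idfs[i] * boosts[i];
            sumOfSquaredWeights += weight * weight;
        }
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    // Step 1: fieldWeight = tf * idf * fieldNorm for one term.
    static double fieldWeight(int frequency, int numDocs, int docFreq, int numTerms) {
        return tf(frequency) * idf(numDocs, docFreq) * fieldNorm(numTerms);
    }

    // Steps 2 and 3: multiply each fieldWeight by queryNorm and sum the products.
    static double score(double[] fieldWeights, double queryNorm) {
        double sum = 0.0;
        for (double fw : fieldWeights) {
            sum += fw * queryNorm;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Made-up corpus: 1000 documents, a field with 16 terms, two query terms.
        int numDocs = 1000;
        double[] idfs = { idf(numDocs, 9), idf(numDocs, 99) };
        double[] boosts = { 1.0, 1.0 };
        double[] fieldWeights = {
            fieldWeight(4, numDocs, 9, 16),
            fieldWeight(1, numDocs, 99, 16)
        };
        System.out.println(score(fieldWeights, queryNorm(idfs, boosts)));
    }
}
```

The rarer term (docFreq 9) gets the larger idf and therefore contributes more to the sum than the common one.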

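Judging similarity for multi-term queries

This works as cosine similarity. Suppose the query "happy hippopotamus" gives happy a weight of 2 and hippopotamus a weight of 5. For a document containing only happy, one containing only hippopotamus, and one containing both, compute the weight of each word and represent each document as a vector of those weights. The document whose vector points in the direction closest to the query vector is the best match; in the example below that is doc 3. A common word such as happy will probably score low in any document.

ex) 1) you look happy, 2) you look like hippopotamus, 3) you look so happy, hippopotamus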
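The comparison of angles can be made concrete with a short sketch. It assumes, purely for illustration, that a document's weight for a word it contains equals the query weight (2 for happy, 5 for hippopotamus) and is 0 otherwise; the class CosineSketch is hypothetical.

```java
import java.util.Locale;

// Hypothetical sketch: cosine similarity between a query vector and document vectors.
final class CosineSketch {

    // cos(a, b) = (a . b) / (|a| * |b|)
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Vector layout: [weight of happy, weight of hippopotamus] (assumed weights).
        double[] query = {2, 5};
        double[] doc1 = {2, 0};  // you look happy
        double[] doc2 = {0, 5};  // you look like hippopotamus
        double[] doc3 = {2, 5};  // you look so happy, hippopotamus

        System.out.println(String.format(Locale.ROOT, "%.3f", cosine(query, doc1))); // 0.371
        System.out.println(String.format(Locale.ROOT, "%.3f", cosine(query, doc2))); // 0.928
        System.out.println(String.format(Locale.ROOT, "%.3f", cosine(query, doc3))); // 1.000
    }
}
```

Doc 3 has the same direction as the query, and doc 2 ranks above doc 1 because the rarer word carries the larger weight.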


Recovering a commit after git reset

git show-ref -h HEAD

1) Shows the commit that the current HEAD points to.

git fsck --lost-found

dangling commit 7c61179cbe51c050c5520b4399f7b14eec943754

git reflog
39ba87b... HEAD@{0}: HEAD~1: updating HEAD
7c61179... HEAD@{1}: pull origin master: Fast forward

2) The fsck command reveals the hash of the commit that was reset away.

git merge 7c61179

3) Merging that commit restores the state from before the reset.


About Linux load average

You might be familiar with Linux load averages already. Load averages are the three numbers shown with the uptime and top commands - they look like this:

load average: 0.09, 0.05, 0.01

Most people have an inkling of what the load averages mean: the three numbers represent averages over progressively longer periods of time (one, five, and fifteen minute averages), and lower numbers are better. Higher numbers represent a problem or an overloaded machine. But what's the threshold? What constitutes "good" and "bad" load average values? When should you be concerned over a load average value, and when should you scramble to fix it ASAP?

First, a little background on what the load average values mean. We'll start out with the simplest case: a machine with one single-core processor.

The traffic analogy

A single-core CPU is like a single lane of traffic. Imagine you are a bridge operator ... sometimes your bridge is so busy there are cars lined up to cross. You want to let folks know how traffic is moving on your bridge. A decent metric would be how many cars are waiting at a particular time. If no cars are waiting, incoming drivers know they can drive across right away. If cars are backed up, drivers know they're in for delays.

So, Bridge Operator, what numbering system are you going to use? How about:

0.00 means there's no traffic on the bridge at all. In fact, between 0.00 and 1.00 means there's no backup, and an arriving car will just go right on.
1.00 means the bridge is exactly at capacity. All is still good, but if traffic gets a little heavier, things are going to slow down.
over 1.00 means there's backup. How much? Well, 2.00 means that there are two lanes' worth of cars total -- one lane's worth on the bridge, and one lane's worth waiting. 3.00 means there are three lanes' worth total -- one lane's worth on the bridge, and two lanes' worth waiting. Etc.

[Figures: the bridge at a load of 1.00, a load of 0.50, and a load of 1.70]

This is basically what CPU load is. "Cars" are processes using a slice of CPU time ("crossing the bridge") or queued up to use the CPU. Unix refers to this as the run-queue length: the sum of the number of processes that are currently running plus the number that are waiting (queued) to run.

Like the bridge operator, you'd like your cars/processes to never be waiting. So, your CPU load should ideally stay below 1.00. Also like the bridge operator, you are still ok if you get some temporary spikes above 1.00 ... but when you're consistently above 1.00, you need to worry.

So you're saying the ideal load is 1.00?

Well, not exactly. The problem with a load of 1.00 is that you have no headroom. In practice, many sysadmins will draw a line at 0.70:

The "Need to Look into it" Rule of Thumb: 0.70. If your load average is staying above 0.70, it's time to investigate before things get worse.
The "Fix this now" Rule of Thumb: 1.00. If your load average stays above 1.00, find the problem and fix it now. Otherwise, you're going to get woken up in the middle of the night, and it's not going to be fun.
The "Arrgh, it's 3AM WTF?" Rule of Thumb: 5.0. If your load average is above 5.00, you could be in serious trouble, your box is either hanging or slowing way down, and this will (inexplicably) happen in the worst possible time like in the middle of the night or when you're presenting at a conference. Don't let it get there.

What about Multi-processors? My load says 3.00, but things are running fine!

Got a quad-processor system? It's still healthy with a load of 3.00.

On a multi-processor system, the load is relative to the number of processor cores available. The "100% utilization" mark is 1.00 on a single-core system, 2.00 on a dual-core, 4.00 on a quad-core, etc.

If we go back to the bridge analogy, the "1.00" really means "one lane's worth of traffic". On a one-lane bridge, that means it's filled up. On a two-lane bridge, a load of 1.00 means it's at 50% capacity -- only one lane is full, so there's another whole lane that can be filled.

[Figure: a load of 2.00 on a two-lane road]

Same with CPUs: a load of 1.00 is 100% CPU utilization on a single-core box. On a dual-core box, a load of 2.00 is 100% CPU utilization.

Multicore vs. multiprocessor

While we're on the topic, let's talk about multicore vs. multiprocessor. For performance purposes, is a machine with a single dual-core processor basically equivalent to a machine with two processors with one core each? Yes. Roughly. There are lots of subtleties here concerning amount of cache, frequency of process hand-offs between processors, etc. Despite those finer points, for the purposes of sizing up the CPU load value, the total number of cores is what matters, regardless of how many physical processors those cores are spread across.

Which leads us to two new Rules of Thumb:

The "number of cores = max load" Rule of Thumb: on a multicore system, your load should not exceed the number of cores available.
The "cores is cores" Rule of Thumb: How the cores are spread out over CPUs doesn't matter. Two quad-cores == four dual-cores == eight single-cores. It's all eight cores for these purposes.
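The rules of thumb so far can be combined into one small check. The sketch below divides the load by the number of cores and then applies the 0.70 / 1.00 / 5.00 thresholds from the article; the class LoadRules and its Status names are hypothetical, and scaling the 5.00 threshold by core count is an assumption rather than something the article states.

```java
// Hypothetical sketch: apply the article's rules of thumb to a per-core load value.
final class LoadRules {

    enum Status { OK, INVESTIGATE, FIX_NOW, SERIOUS }

    // Load divided by core count, so that 1.00 always means "exactly at capacity".
    static double perCore(double load, int cores) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be >= 1");
        }
        return load / cores;
    }

    // Thresholds taken from the rules of thumb: above 0.70, above 1.00, above 5.00.
    static Status classify(double load, int cores) {
        double normalized = perCore(load, cores);
        if (normalized > 5.00) return Status.SERIOUS;
        if (normalized > 1.00) return Status.FIX_NOW;
        if (normalized > 0.70) return Status.INVESTIGATE;
        return Status.OK;
    }

    public static void main(String[] args) {
        System.out.println(classify(0.65, 2)); // OK
        System.out.println(classify(1.80, 2)); // INVESTIGATE
        System.out.println(classify(2.50, 2)); // FIX_NOW
    }
}
```

These thresholds only make sense for a sustained average, so the value passed in would normally be the five or fifteen minute figure.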

Bringing It Home

Let's take a look at the load averages output from uptime:

~ $ uptime
23:05 up 14 days, 6:08, 7 users, load averages: 0.65 0.42 0.36

This is on a dual-core CPU, so we've got lots of headroom. I won't even think about it until load gets and stays above 1.7 or so.

Now, what about those three numbers? 0.65 is the average over the last minute, 0.42 is the average over the last five minutes, and 0.36 is the average over the last 15 minutes. Which brings us to the question:

Which average should I be observing? One, five, or 15 minute?

For the numbers we've talked about (1.00 = fix it now, etc), you should be looking at the five or 15-minute averages. Frankly, if your box spikes above 1.0 on the one-minute average, you're still fine. It's when the 15-minute average goes north of 1.0 and stays there that you need to snap to. (obviously, as we've learned, adjust these numbers to the number of processor cores your system has).

So # of cores is important to interpreting load averages ... how do I know how many cores my system has?

Run cat /proc/cpuinfo to get info on each processor in your system. Note: not available on OSX, Google for alternatives. To get just a count, run it through grep and word count: grep 'model name' /proc/cpuinfo | wc -l
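From a Java program the same two numbers can be read without parsing /proc. This is a minimal sketch using standard JDK calls; the class LoadProbe is hypothetical, availableProcessors() reports the processors available to the JVM (which may differ from the grep count, for example under container limits), and getSystemLoadAverage() returns the one-minute average or a negative value where the platform does not provide it.

```java
import java.lang.management.ManagementFactory;

// Hypothetical sketch: read core count and one-minute load average from the JVM.
final class LoadProbe {

    // Number of processors available to this JVM (always at least 1).
    static int cores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // One-minute system load average, or a negative value if it is not available.
    static double oneMinuteLoad() {
        return ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
    }

    public static void main(String[] args) {
        int cores = cores();
        double load = oneMinuteLoad();
        System.out.println("cores: " + cores);
        if (load < 0) {
            System.out.println("load average not available on this platform");
        } else {
            System.out.println("1-min load: " + load + ", per core: " + (load / cores));
        }
    }
}
```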


nginx module for checking requests being processed and requests waiting

http://nginx.org/en/docs/http/ngx_http_stub_status_module.html


LEFT Outer JOIN

1. OUTER JOIN

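Where an INNER JOIN joins only the rows that satisfy the JOIN condition,

an OUTER JOIN also includes the rows that do not satisfy it.

It is not needed often, but it is occasionally very useful, so it is worth knowing.

The basic syntax is as follows.

SELECT <column list>
FROM <first table (LEFT table)>
<LEFT | RIGHT | FULL> OUTER JOIN <second table (RIGHT table)>
ON <join condition>
[WHERE search condition]

It looks similar to INNER JOIN, but there are the new keywords LEFT, RIGHT, and FULL.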

2. LEFT OUTER JOIN

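LEFT OUTER JOIN means that every row of the left table must be included in the result even if it does not satisfy the condition.

That is, for FROM <first table> LEFT OUTER JOIN <second table>, every row of the first table must be output.

Let's look at an example.

-- Look at the purchase history of all members.
-- Members with no purchase history must be output as well.
-- Because this is a LEFT OUTER JOIN, every row of UserTable is output.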
SELECT U.ID, Name, GoodName, Addr
FROM UserTable U -- LEFT Table
LEFT OUTER JOIN BuyTable B -- RIGHT Table
ON U.ID = B.ID
ORDER BY U.ID

Just as the INNER keyword can be omitted in INNER JOIN,

LEFT OUTER JOIN can also be written as just LEFT JOIN.

In the result of the example above, every row of UserTable is output.
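To make the semantics concrete, here is a minimal in-memory sketch of what a left outer join computes. It is not how a database executes the join; the class LeftJoinSketch, the sample rows in USERS and BUYS, and the two-column layout (id, then one value column) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: nested-loop left outer join on column 0 of each row.
final class LeftJoinSketch {

    // Made-up sample data: {id, name} and {id, goodName}.
    static final List<String[]> USERS = Arrays.asList(
            new String[] {"u1", "Kim"},
            new String[] {"u2", "Lee"});
    static final List<String[]> BUYS = Arrays.asList(
            new String[] {"u1", "shoes"},
            new String[] {"u1", "book"});

    // Every left row appears at least once; unmatched left rows get null on the right side.
    static List<String[]> leftOuterJoin(List<String[]> left, List<String[]> right) {
        List<String[]> out = new ArrayList<>();
        for (String[] l : left) {
            boolean matched = false;
            for (String[] r : right) {
                if (l[0].equals(r[0])) {
                    out.add(new String[] {l[0], l[1], r[1]});
                    matched = true;
                }
            }
            if (!matched) {
                out.add(new String[] {l[0], l[1], null});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints:
        // [u1, Kim, shoes]
        // [u1, Kim, book]
        // [u2, Lee, null]
        for (String[] row : leftOuterJoin(USERS, BUYS)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

A right outer join is the same operation with the two inputs swapped, which is why the RIGHT OUTER JOIN example can produce the same result.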

3. RIGHT OUTER JOIN

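RIGHT OUTER JOIN means that every row of the right table must be included in the result even if it does not satisfy the condition.

That is, for FROM <first table> RIGHT OUTER JOIN <second table>, every row of the second table must be output.

Let's write an example that gives the same result as the LEFT OUTER JOIN example.

-- Look at the purchase history of all members.
-- Members with no purchase history must be output as well.
-- Because this is a RIGHT OUTER JOIN, every row of UserTable is output.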
SELECT U.ID, Name, GoodName, Addr
FROM BuyTable B -- LEFT Table
RIGHT OUTER JOIN UserTable U -- RIGHT Table
ON B.ID = U.ID
ORDER BY U.ID

Likewise, RIGHT OUTER JOIN can be written as just RIGHT JOIN.

4. FULL OUTER JOIN

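This is called a full join or full outer join.

FULL OUTER JOIN can be thought of as LEFT OUTER JOIN and RIGHT OUTER JOIN combined.

That is, instead of keeping the non-matching rows of only one side,

it keeps the rows that do not satisfy the condition on both sides.

As a result, every row of both tables appears in the result whether or not it matches the condition.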

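5. Joining three or more tables

Let's reuse the example from the INNER JOIN post.

In the INNER JOIN example, '김제둥', who has not joined any club, was not included in the result.

Let's rewrite it with OUTER JOIN so that students who have not joined a club are output too.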

SELECT S.Name, Addr, C.Name, RoomNo
FROM StudentTable S
-- First LEFT OUTER JOIN with StdClubTable so that every row of StudentTable is output
LEFT OUTER JOIN StdClubTable SC
ON S.Name = SC.StdName
-- Then LEFT OUTER JOIN ClubTable onto that result
LEFT OUTER JOIN ClubTable C
ON SC.ClubName = C.Name

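In the result, '김제둥', who has not joined any club, is now output as well.

This time, let's list the enrolled students per club rather than per student,

and write the example so that a club with no members at all is still output.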

SELECT C.Name, RoomNo, S.Name, Addr
FROM StudentTable S
-- First LEFT OUTER JOIN with StdClubTable so that every row of StudentTable is output
LEFT OUTER JOIN StdClubTable SC
ON S.Name = SC.StdName
-- Then RIGHT OUTER JOIN ClubTable onto that result so that every club is output
RIGHT OUTER JOIN ClubTable C
ON SC.ClubName = C.Name

In the example above, students who have not joined any club contribute nothing to the intended final output,

so writing it as below is cleaner and also performs slightly better.

SELECT C.Name, RoomNo, S.Name, Addr
FROM StudentTable S
-- INNER JOIN StudentTable with StdClubTable to extract the students who have joined a club,
INNER JOIN StdClubTable SC
ON S.Name = SC.StdName
-- then RIGHT OUTER JOIN ClubTable onto the result of the INNER JOIN above
-- so that every club is output
RIGHT OUTER JOIN ClubTable C
ON SC.ClubName = C.Name

Reference: http://egloos.zum.com/sweeper/v/3002220


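Java and Korean encoding (UTF-8, Unicode) issues

Java uses UTF-16 as the internal encoding of its strings, so a character takes 16 bits or, as a surrogate pair, 32 bits. UTF-8, by contrast, is variable-length: the number of bytes depends on the code point range.

The code point range of precomposed Hangul syllables is U+AC00~U+D7AF, so in UTF-8 a Hangul syllable is always encoded in 3 bytes. If a URL parameter value looks like %ED%95%9C%EA%B8%80, it is therefore very likely UTF-8 encoded. For example, char='한' becomes 3 bytes when converted to UTF-8.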
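The byte counts above are easy to check from Java itself. The sketch below uses only standard JDK calls; the class HangulUtf8Sketch is hypothetical, and the string "한글" is written with Unicode escapes (\uD55C\uAE00) so that the result does not depend on the encoding of the source file.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: UTF-16 length vs. UTF-8 byte length of Hangul text.
final class HangulUtf8Sketch {

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Percent-encodes the string using UTF-8, as seen in URL parameters.
    static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String hangul = "\uD55C\uAE00"; // the two syllables of "한글"
        System.out.println(hangul.length());    // 2 (UTF-16 code units)
        System.out.println(utf8Length(hangul)); // 6 (3 bytes per syllable)
        System.out.println(urlEncode(hangul));  // %ED%95%9C%EA%B8%80
    }
}
```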

Reference: http://helloworld.naver.com/helloworld/76650


Looping over numbers in a shell script

for i in {1..5}
do
   echo "$i"
done


180 posts in 'All categories'
