<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://jonghyunho.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://jonghyunho.github.io/" rel="alternate" type="text/html" /><updated>2026-02-25T06:21:44+09:00</updated><id>https://jonghyunho.github.io/feed.xml</id><title type="html">Jonghyun Ho</title><subtitle></subtitle><author><name>/Jonghyun Ho</name></author><entry><title type="html">QuantaAlpha: LLM이 진화 알고리즘으로 주식 알파 팩터를 발굴하는 방법</title><link href="https://jonghyunho.github.io/ai/finance/quant/QuantaAlpha-LLM-Alpha-Mining.html" rel="alternate" type="text/html" title="QuantaAlpha: LLM이 진화 알고리즘으로 주식 알파 팩터를 발굴하는 방법" /><published>2026-02-25T06:00:00+09:00</published><updated>2026-02-25T06:00:00+09:00</updated><id>https://jonghyunho.github.io/ai/finance/quant/QuantaAlpha-LLM-Alpha-Mining</id><content type="html" xml:base="https://jonghyunho.github.io/ai/finance/quant/QuantaAlpha-LLM-Alpha-Mining.html"><![CDATA[<blockquote>
  <p>📄 Paper: <strong><a href="https://arxiv.org/abs/2602.07085">QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining</a></strong>
(Jun Han, Shuo Zhang, Wei Li and 24 others; Shanghai University of Finance and Economics (SUFE), QuantaAlpha, Stanford, Peking University (PKU); February 2026)</p>
</blockquote>

<hr />

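<h2 id="1-배경-알파-팩터란-무엇인가">1. Background: What Is an Alpha Factor?</h2>

<p>The core goal of quantitative investing is to generate <strong>alpha</strong>: return in excess of what exposure to the overall market (beta) delivers. To that end, quants search for <strong>alpha factors</strong>.</p>

<p>An alpha factor is a formula that predicts the future returns of stocks. For example:</p>

<ul>
  <li><strong>Momentum factor:</strong> stocks with high returns over the past six months are likely to keep rising</li>
  <li><strong>Mean-reversion factor:</strong> stocks that dropped sharply in the short term are likely to recover</li>
  <li><strong>Volume factor:</strong> stocks with abnormally high trading volume carry information</li>
</ul>

<p>Formally, given a market data tensor <strong>X ∈ R^(N×T×D)</strong> holding D features for N stocks over T time steps, the problem is to find a function <strong>f</strong> that predicts the next-period cross-sectional returns <strong>y(t+1)</strong>.</p>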

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>f(X_t) → y(t+1)
</code></pre></div></div>

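<p>The goal is to find factors that maximize predictive power, measured by the information coefficient (IC), the cross-sectional correlation between factor values and next-period returns, without being needlessly complex.</p>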
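<p>As a rough illustration of how predictive power is usually scored, the sketch below computes IC (Pearson correlation) and Rank IC (the same correlation on ranks) for a single day's cross-section. The five factor values and returns are made-up numbers, and the helper names are assumptions for this example rather than anything taken from the paper.</p>

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i != len(order):
        j = i
        while j + 1 != len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def information_coefficient(factor, next_returns):
    """IC: correlation between factor values and next-period returns."""
    return pearson(factor, next_returns)


def rank_ic(factor, next_returns):
    """Rank IC: the same correlation computed on ranks instead of raw values."""
    return pearson(ranks(factor), ranks(next_returns))


# Hypothetical one-day cross-section of five stocks
factor_today = [0.8, 0.1, 0.5, -0.3, 0.2]
returns_next = [0.020, -0.010, 0.015, -0.020, 0.000]

print(round(information_coefficient(factor_today, returns_next), 4))
print(round(rank_ic(factor_today, returns_next), 4))
```

<p>In this toy data the factor orders the stocks exactly as their next-day returns do, so the Rank IC is 1.0. In practice the value is computed per day and then averaged over the evaluation period.</p>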

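<h3 id="왜-어려운가">Why Is It Hard?</h3>

<p>The stock market is a very difficult environment to work in:</p>

<ol>
  <li><strong>Extremely low signal-to-noise ratio</strong>: meaningful patterns are buried in market noise</li>
  <li><strong>Non-stationarity</strong>: market regimes keep changing (e.g., a large-cap market giving way to a small-cap, theme-driven market)</li>
  <li><strong>Alpha decay</strong>: once a good factor becomes known, many investors follow it and its effect fades</li>
  <li><strong>High dimensionality</strong>: price, volume, and fundamental data allow combinations of hundreds of variables</li>
</ol>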

<hr />

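<h2 id="2-기존-방법들과-그-한계">2. Existing Methods and Their Limits</h2>

<h3 id="전통적-머신러닝딥러닝">Traditional Machine Learning / Deep Learning</h3>

<p>Return prediction with machine-learning and deep-learning models such as XGBoost, LSTM, and Transformer has been studied extensively. These models are <strong>black boxes</strong>, however, and cannot explain why a given stock should rise.</p>

<h3 id="llm-기반-에이전트-프레임워크-1세대">LLM-Based Agent Frameworks (First Generation)</h3>

<p>More recently, several efforts have used LLMs to automate the workflow of a quant researcher:</p>

<ul>
  <li><strong>RD-Agent</strong>: separates a Research agent from a Development agent and jointly optimizes factors and models</li>
  <li><strong>AlphaAgent</strong>: applies regularization at the factor-generation stage to curb alpha decay</li>
</ul>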

<p>The typical workflow:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>① Hypothesis generation
    ↓
② Factor construction
    ↓
③ Backtest evaluation
    ↓
④ Revise the hypothesis based on the results → back to ①
</code></pre></div></div>

<p><strong>This approach has three key limitations, however:</strong></p>

<table>
  <thead>
    <tr>
      <th>Limitation</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Fragile Controllability</strong></td>
      <td>Refinement driven by noisy backtest results gradually moves the factor away from its original economic meaning, a "semantic drift"</td>
    </tr>
    <tr>
      <td><strong>Limited Trustworthiness</strong></td>
      <td>Validated ideas are not passed on systematically to the next iteration, and it is hard to trace why a good result came about</td>
    </tr>
    <tr>
      <td><strong>Constrained Exploration</strong></td>
      <td>The search keeps revisiting the neighborhood of the initial idea (a local-optimum problem) and does not explore enough alternatives</td>
    </tr>
  </tbody>
</table>

<hr />

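<h2 id="3-quantaalpha의-핵심-아이디어">3. The Core Idea of QuantaAlpha</h2>

<p><strong>"Treat each alpha-mining run as a single trajectory, and evolve the trajectory itself."</strong></p>

<p>Where earlier methods focused on revising the output of individual steps (factor code, hypothesis text), QuantaAlpha treats <strong>the whole process, from hypothesis generation to backtest evaluation,</strong> as one unit.</p>

<h3 id="궤적trajectory이란">What Is a Trajectory?</h3>

<p>A single alpha-mining run is represented as the following sequence:</p>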

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>τ = (s0, a0, s1, a1, ..., sn)
</code></pre></div></div>

<ul>
  <li><code class="language-plaintext highlighter-rouge">s0</code>: initial context (market conditions, seed factors supplied by the user)</li>
  <li><code class="language-plaintext highlighter-rouge">ai</code>: the action the agent takes at step i</li>
  <li><code class="language-plaintext highlighter-rouge">sn</code>: final state (backtest results)</li>
</ul>

<p>The quality of a trajectory is measured by its final reward:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>R(τ) = L(f_τ(X), y) - λR(f_τ)
       ^^^^^^^^^^^^   ^^^^^^^
       predictive     complexity
       power          penalty
</code></pre></div></div>

<p><strong>Goal:</strong> find the trajectory-generation policy π* that maximizes this reward.</p>

<hr />

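<h2 id="4-quantaalpha의-4가지-핵심-구성요소">4. The Four Core Components of QuantaAlpha</h2>

<h3 id="구성요소-a-다양화된-초기-계획-diversified-planning-initialization">Component A: Diversified Planning Initialization</h3>

<p>An initialization agent produces a set of diverse, mutually complementary hypotheses at once.</p>

<p>Diversity is enforced along three axes:</p>
<ul>
  <li><strong>Signal source</strong>: price signals vs. volume signals vs. fundamental indicators</li>
  <li><strong>Time scale</strong>: short-term (5 days) vs. medium-term (20 days) vs. long-term (60 days)</li>
  <li><strong>Mechanism type</strong>: momentum vs. mean reversion vs. regime-conditional signals</li>
</ul>

<blockquote>
  <p>💡 The principle is to scatter seeds evenly across the whole field instead of planting them in one spot. This lowers the risk of converging early on a narrow local optimum.</p>
</blockquote>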

<hr />

<h3 id="구성요소-b-제어-가능한-팩터-구성-controllable-factor-construction">Component B: Controllable Factor Construction</h3>

<p>Generating a factor directly as Python code leads to three problems: syntax errors, dependency mismatches, and semantic drift. QuantaAlpha introduces an <strong>abstract syntax tree (AST)</strong> as an intermediate representation.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Hypothesis h (natural language)
    "Deviation of the current price from the 10-day low, adjusted by a volume weight"
    ↓ [idea agent]
Semantic description d
    "Deviation from TS_MIN(close, 10) / volume normalization"
    ↓ [factor agent]
Symbolic expression f (AST)
    RANK(DIV(SUB(close, TS_MIN(close, 10)), SMA(volume, 10)))
    ↓ [compiler]
Executable code c (Python)
</code></pre></div></div>

<p><strong>Structure of the AST:</strong></p>
<ul>
  <li><strong>Leaf nodes</strong>: raw features (e.g., <code class="language-plaintext highlighter-rouge">$close</code>, <code class="language-plaintext highlighter-rouge">$volume</code>, <code class="language-plaintext highlighter-rouge">$high</code>)</li>
  <li><strong>Internal nodes</strong>: operator instances (e.g., <code class="language-plaintext highlighter-rouge">TS_MIN()</code>, <code class="language-plaintext highlighter-rouge">SMA()</code>, <code class="language-plaintext highlighter-rouge">RANK()</code>)</li>
</ul>

<p>This makes the computational dependencies and the data flow fully transparent.</p>
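<p>To make the leaf/operator split concrete, here is a minimal sketch of how such a tree could be evaluated. It represents the AST as nested tuples and implements only the five operators from the example expression. The tuple encoding, the rolling-window behavior at the start of a series, and the tie handling in <code class="language-plaintext highlighter-rouge">RANK</code> are assumptions made for this illustration, not the paper's compiler.</p>

```python
def ts_apply(series, window, fn):
    """Apply fn to a trailing window (shorter at the start of the series)."""
    return [fn(series[max(0, t - window + 1): t + 1]) for t in range(len(series))]


def cross_rank(panel):
    """Rank stocks against each other at every time step, scaled to (0, 1].

    Ties are broken by input order, which is a simplification.
    """
    names = list(panel)
    length = len(panel[names[0]])
    out = {name: [] for name in names}
    for t in range(length):
        ordered = sorted(names, key=lambda name: panel[name][t])
        for position, name in enumerate(ordered, start=1):
            out[name].append(position / len(names))
    return out


def evaluate(node, data):
    """Evaluate an AST node to a {stock: time series} panel."""
    if isinstance(node, str):  # leaf: a raw feature such as "$close"
        return data[node.lstrip("$")]
    op, *args = node
    if op in ("TS_MIN", "SMA"):  # time-series operators: (op, child, window)
        child, window = evaluate(args[0], data), args[1]
        fn = min if op == "TS_MIN" else (lambda w: sum(w) / len(w))
        return {s: ts_apply(v, window, fn) for s, v in child.items()}
    if op in ("SUB", "DIV"):  # element-wise binary operators
        left, right = evaluate(args[0], data), evaluate(args[1], data)
        fn = (lambda a, b: a - b) if op == "SUB" else (lambda a, b: a / b)
        return {s: [fn(a, b) for a, b in zip(left[s], right[s])] for s in left}
    if op == "RANK":  # cross-sectional operator
        return cross_rank(evaluate(args[0], data))
    raise ValueError(f"unknown operator: {op}")


# RANK(DIV(SUB(close, TS_MIN(close, 10)), SMA(volume, 10)))
factor = ("RANK", ("DIV", ("SUB", "$close", ("TS_MIN", "$close", 10)),
                   ("SMA", "$volume", 10)))

# Hypothetical three-day history for two stocks
data = {
    "close": {"A": [10.0, 11.0, 12.0], "B": [20.0, 19.0, 18.0]},
    "volume": {"A": [100.0, 100.0, 100.0], "B": [200.0, 200.0, 200.0]},
}

print(evaluate(factor, data))
```

<p>Because every value flows through an explicit node, each intermediate result (for example the output of <code class="language-plaintext highlighter-rouge">TS_MIN</code>) can be inspected on its own, which is the transparency the AST representation is meant to provide.</p>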

<h4 id="일관성-검증-consistency-verification">Consistency Verification</h4>

<p>An LLM verifier checks two things:</p>
<ol>
  <li>Semantic alignment among <strong>hypothesis h ↔ semantic description d ↔ symbolic expression f</strong></li>
  <li>Fidelity between <strong>symbolic expression f ↔ generated code c</strong></li>
</ol>

<p>When verification fails, only the offending stage is regenerated.</p>

<h4 id="복잡도--중복성-제어">Complexity &amp; Redundancy Control</h4>

<p><strong>Complexity measure:</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C(f) = α₁·SL(f) + α₂·PC(f) + α₃·log(1+|F_f|)
</code></pre></div></div>

<ul>
  <li><code class="language-plaintext highlighter-rouge">SL(f)</code>: symbolic length (the length of the expression)</li>
  <li><code class="language-plaintext highlighter-rouge">PC(f)</code>: number of free parameters (window sizes and the like)</li>
  <li><code class="language-plaintext highlighter-rouge">F_f</code>: the set of raw features used</li>
</ul>
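<p>A small worked example may help here. The sketch below evaluates C(f) for the expression RANK(DIV(SUB(close, TS_MIN(close, 10)), SMA(volume, 10))), stored as nested tuples. It assumes that SL(f) is the number of nodes in the tree and that all three weights equal 1.0; the post does not give the paper's exact length definition or weight values, so both are placeholders.</p>

```python
import math


def walk(node):
    """Yield every node of a nested-tuple expression tree (operator names excluded)."""
    yield node
    if isinstance(node, tuple):
        for child in node[1:]:
            yield from walk(child)


def complexity(tree, a1=1.0, a2=1.0, a3=1.0):
    """C(f) = a1*SL(f) + a2*PC(f) + a3*log(1 + |F_f|), with assumed weights."""
    nodes = list(walk(tree))
    symbolic_length = len(nodes)                                   # SL(f): assumed to be the node count
    free_params = sum(isinstance(n, (int, float)) for n in nodes)  # PC(f): numeric parameters
    features = {n for n in nodes if isinstance(n, str)}            # F_f: distinct raw features
    return a1 * symbolic_length + a2 * free_params + a3 * math.log(1 + len(features))


factor = ("RANK", ("DIV", ("SUB", "$close", ("TS_MIN", "$close", 10)),
                   ("SMA", "$volume", 10)))

# 10 nodes, 2 window parameters, 2 distinct features: 10 + 2 + log(3)
print(round(complexity(factor), 4))
```

<p>The logarithm makes each additional raw feature cost less than the previous one, so the penalty is dominated by expression length and parameter count.</p>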

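<p><strong>Redundancy measure:</strong> structural similarity between two factors is computed as the size of the largest subtree their ASTs have in common. If the similarity to the existing factor pool (the alpha zoo) exceeds a threshold, the candidate is rejected and regenerated.</p>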
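<p>One simple way to compute such a measure, sketched under the assumption that ASTs are stored as nested tuples, is to enumerate every subtree of both expressions and take the largest one that appears in both. The <code class="language-plaintext highlighter-rouge">NEG</code> operator and the second factor are hypothetical, and the paper may normalize or threshold the score differently.</p>

```python
def subtrees(node):
    """Yield every subtree; nested tuples are hashable, so they can be put in a set."""
    yield node
    if isinstance(node, tuple):
        for child in node[1:]:
            yield from subtrees(child)


def size(node):
    """Number of nodes in a subtree."""
    return sum(1 for _ in subtrees(node))


def largest_common_subtree(f, g):
    """Size of the largest subtree that occurs in both expression trees."""
    shared = set(subtrees(f)).intersection(subtrees(g))
    return max((size(s) for s in shared), default=0)


candidate = ("RANK", ("DIV", ("SUB", "$close", ("TS_MIN", "$close", 10)),
                      ("SMA", "$volume", 10)))
existing = ("NEG", ("SUB", "$close", ("TS_MIN", "$close", 10)))  # hypothetical pool member

# The shared part is SUB(close, TS_MIN(close, 10)), which has 5 nodes
print(largest_common_subtree(candidate, existing))
```

<p>A candidate whose score against any pool member exceeds the chosen threshold would then be sent back for regeneration.</p>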

<hr />

<h3 id="구성요소-c-자기-진화-self-evolution">Component C: Self-Evolution</h3>

<p>This is the heart of QuantaAlpha. It is inspired by <strong>biological evolution (mutation + crossover)</strong>.</p>

<h4 id="-변이-mutation--탐색의-핵심">🧬 Mutation: The Core of Exploration</h4>

<p>In a low-reward trajectory, <strong>only the most problematic step</strong> is located and revised. The remaining steps are frozen.</p>

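<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Original trajectory:
  [generate hypothesis A] → [symbolic expression α] → [generate code] → [backtest: IC=0.05]
                                                                        ↑ low reward

Self-reflection:
  "The time scale in symbolic expression α is too short, so it is vulnerable to noise"

After mutation:
  [generate hypothesis A] → [symbolic expression α' (10 days → 20 days)] → [regenerate code] → [IC=0.12]
  ^^^^^^^^^^^^^^^^^^^^^^^   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^   ^^^^^^^^^^^^^^^^^
  frozen                    modified step                                  regenerated automatically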
</code></pre></div></div>

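<p>A mutation can involve mechanism-level changes such as:</p>
<ul>
  <li>Changing the time scale (5 days → 20 days)</li>
  <li>Adding a regime condition (unconditional momentum → momentum only in low-volatility periods)</li>
  <li>Swapping the signal channel (price-based → volume-based)</li>
</ul>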
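<p>The freeze/modify/regenerate pattern can be sketched as a plain data transformation. In the example below a trajectory is a list of step dictionaries; the field names (<code class="language-plaintext highlighter-rouge">stage</code>, <code class="language-plaintext highlighter-rouge">content</code>, <code class="language-plaintext highlighter-rouge">stale</code>) and the function itself are hypothetical, and in the real system an LLM would choose the step to rewrite and produce the new content.</p>

```python
def mutate(trajectory, index, new_content):
    """Return a child trajectory.

    Steps before `index` are reused unchanged (frozen), step `index` is
    rewritten, and every later step is marked stale so it gets regenerated.
    """
    child = []
    for i, step in enumerate(trajectory):
        if index > i:
            child.append(step)                                       # frozen
        elif i == index:
            child.append({**step, "content": new_content})           # mutated
        else:
            child.append({**step, "content": None, "stale": True})   # to regenerate
    return child


# Hypothetical low-reward parent trajectory
parent = [
    {"stage": "hypothesis", "content": "price deviation from a recent low"},
    {"stage": "expression", "content": ("TS_MIN", "$close", 10)},
    {"stage": "code", "content": "compiled_v1"},
    {"stage": "backtest", "content": {"ic": 0.05}},
]

# Mutate only the symbolic expression: widen the window from 10 to 20 days
child = mutate(parent, 1, ("TS_MIN", "$close", 20))
for step in child:
    print(step)
```

<p>The parent list is left untouched, so both versions stay available for later comparison or crossover.</p>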

<h4 id="-교차-crossover--검증된-패턴의-재활용">🔀 Crossover: Reusing Validated Patterns</h4>

<p>From several high-performing trajectories, <strong>only the segments that are their strengths are selected and combined</strong>.</p>

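<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Parent trajectory 1 (IC=0.12):  [good hypothesis structure] → [ordinary implementation] → [ordinary revision]
Parent trajectory 2 (IC=0.11):  [ordinary hypothesis]       → [good implementation]     → [effective error fix]

↓ crossover

Child trajectory:               [good hypothesis structure] → [good implementation]     → [effective error fix]
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                inherits and combines the strengths of each parent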
</code></pre></div></div>

<p>This mimics how human quant researchers combine the strengths of different strategies to build a new one.</p>

<hr />

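<h3 id="구성요소-d-최종-팩터-풀-final-factor-pool">Component D: Final Factor Pool</h3>

<p>Factors validated during evolution are accumulated and managed in a pool.</p>

<p><strong>Admission rule (greedy, RankIC-based):</strong></p>
<ol>
  <li>Sort all candidate factors by RankIC in descending order</li>
  <li>Add a candidate only if its absolute correlation with every factor already in the pool is below 0.7</li>
  <li>Cap the pool size at 50% of all discovered factors</li>
</ol>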
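<p>The three rules above can be sketched as a short greedy loop. The factor names, RankIC values, and pairwise correlations below are invented for illustration, and <code class="language-plaintext highlighter-rouge">corr</code> stands in for whatever correlation estimate the real pipeline uses.</p>

```python
def greedy_pool(candidates, corr, max_corr=0.7, max_fraction=0.5):
    """Greedy admission into the factor pool.

    candidates: {factor name: RankIC}
    corr(a, b): correlation between two factors' values
    """
    limit = int(len(candidates) * max_fraction)  # rule 3: cap on pool size
    pool = []
    # rule 1: visit candidates from highest to lowest RankIC
    for name in sorted(candidates, key=candidates.get, reverse=True):
        if len(pool) == limit:
            break
        # rule 2: admit only if |corr| stays below the threshold for every pool member
        if all(max_corr > abs(corr(name, kept)) for kept in pool):
            pool.append(name)
    return pool


# Hypothetical candidates and pairwise correlations
candidates = {"gap_z": 0.079, "gap_accept": 0.074, "trend": 0.059, "reversal": 0.030}
pair_corr = {frozenset(("gap_z", "gap_accept")): 0.85}


def corr(a, b):
    """Look up a pairwise correlation; unlisted pairs default to 0.1."""
    return pair_corr.get(frozenset((a, b)), 0.1)


print(greedy_pool(candidates, corr))
```

<p>Here <code class="language-plaintext highlighter-rouge">gap_accept</code> is skipped despite its high RankIC because it is too correlated with <code class="language-plaintext highlighter-rouge">gap_z</code>, and the pool stops at two factors, half of the four candidates.</p>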

<p>→ <strong>Keeps the pool diverse and limits factor crowding, which helps mitigate alpha decay</strong></p>

<hr />

<h2 id="5-실험-결과">5. Experimental Results</h2>

<h3 id="51-실험-설정">5.1 Experimental Setup</h3>

<ul>
  <li><strong>Dataset</strong>: CSI 300 (300 large-cap Chinese A-shares)</li>
  <li><strong>Training period</strong>: January 2016 to December 2020</li>
  <li><strong>Validation period</strong>: January 2021 to December 2021</li>
  <li><strong>Test period</strong>: January 2022 to December 2025 (four years)</li>
  <li><strong>Main metrics</strong>: IC, ICIR, Rank IC, Rank ICIR, ARR, MDD, IR (SHR), CR</li>
</ul>
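<p>For readers unfamiliar with these metrics, the sketch below shows common textbook definitions of three of them: ICIR as the mean of the per-period IC divided by its standard deviation, MDD as the largest peak-to-trough loss of the compounded equity curve, and ARR as a compounded annualized return. These conventions (sample standard deviation, compounding, 252 trading days) are assumptions; the post does not spell out the paper's exact formulas, and the input numbers are made up.</p>

```python
from statistics import mean, stdev


def icir(ic_series):
    """Information coefficient information ratio: mean IC over its volatility."""
    return mean(ic_series) / stdev(ic_series)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1 + r
        peak = max(peak, wealth)
        worst = max(worst, 1 - wealth / peak)
    return worst


def annualized_return(returns, periods_per_year=252):
    """Compounded return scaled to one year (252 trading days assumed)."""
    wealth = 1.0
    for r in returns:
        wealth *= 1 + r
    return wealth ** (periods_per_year / len(returns)) - 1


# Hypothetical inputs
daily_ic = [0.02, 0.04, 0.06]
period_returns = [0.10, -0.20, 0.05]

print(round(icir(daily_ic), 4))                 # mean 0.04 / stdev 0.02
print(round(max_drawdown(period_returns), 4))   # drop from 1.10 to 0.88
print(round(annualized_return(period_returns, periods_per_year=3), 4))
```

<p>A higher ICIR means the factor's predictive power is not only large on average but also stable from period to period, which is why it is reported alongside IC.</p>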

<h3 id="52-전체-비교-결과-csi-300">5.2 Overall Comparison (CSI 300)</h3>

<table>
  <thead>
    <tr>
      <th>Method</th>
      <th>Model</th>
      <th>IC</th>
      <th>ICIR</th>
      <th>ARR (%)</th>
      <th>MDD (%)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Traditional ML</strong></td>
      <td>LightGBM</td>
      <td>0.0247</td>
      <td>0.2055</td>
      <td>0.07</td>
      <td>21.80</td>
    </tr>
    <tr>
      <td><strong>Deep learning</strong></td>
      <td>Transformer</td>
      <td>0.0331</td>
      <td>0.2702</td>
      <td>5.21</td>
      <td>13.81</td>
    </tr>
    <tr>
      <td> </td>
      <td>TRA</td>
      <td>0.0421</td>
      <td>0.3402</td>
      <td>6.81</td>
      <td>8.51</td>
    </tr>
    <tr>
      <td><strong>Factor library</strong></td>
      <td>Alpha158</td>
      <td>0.0131</td>
      <td>0.0817</td>
      <td>2.66</td>
      <td>10.15</td>
    </tr>
    <tr>
      <td><strong>RD-Agent</strong></td>
      <td>GPT-5.2</td>
      <td>0.0531</td>
      <td>0.4300</td>
      <td>9.91</td>
      <td>14.82</td>
    </tr>
    <tr>
      <td><strong>AlphaAgent</strong></td>
      <td>Claude-4.5</td>
      <td>0.1092</td>
      <td>0.7718</td>
      <td>16.48</td>
      <td>8.14</td>
    </tr>
    <tr>
      <td> </td>
      <td>GPT-5.2</td>
      <td>0.0966</td>
      <td>0.6344</td>
      <td>15.54</td>
      <td>12.89</td>
    </tr>
    <tr>
      <td><strong>QuantaAlpha</strong></td>
      <td>DeepSeek-V3.2</td>
      <td>0.1338</td>
      <td>0.8533</td>
      <td>23.77</td>
      <td>9.14</td>
    </tr>
    <tr>
      <td> </td>
      <td>Claude-4.5</td>
      <td>0.1111</td>
      <td>0.6374</td>
      <td>22.70</td>
      <td>6.96</td>
    </tr>
    <tr>
      <td>🏆 <strong>QuantaAlpha</strong></td>
      <td><strong>GPT-5.2</strong></td>
      <td><strong>0.1501</strong></td>
      <td><strong>0.9110</strong></td>
      <td><strong>27.75</strong></td>
      <td><strong>7.98</strong></td>
    </tr>
  </tbody>
</table>

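<p><strong>Key results (with GPT-5.2 as the backbone):</strong></p>
<ul>
  <li>vs. RD-Agent: IC +0.0970 ↑, ARR +17.84%p ↑, MDD -6.84%p ↓</li>
  <li>vs. AlphaAgent: IC +0.0535 ↑, ARR +12.21%p ↑, MDD -4.91%p ↓</li>
</ul>

<p>Another important result is that QuantaAlpha stays at or near the top across the backbone LLMs tested in the paper (Qwen, DeepSeek, Gemini, Claude, GPT), which suggests <strong>low dependence on any particular model</strong>. The table above lists three of these backbones; the one exception visible there is ICIR with Claude-4.5, where AlphaAgent scores 0.7718 against QuantaAlpha's 0.6374.</p>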

<h3 id="53-시장-전이cross-market-transfer-성능">5.3 Cross-Market Transfer Performance</h3>

<p><strong>Factors mined on CSI 300 are applied to other markets as-is, without re-optimization:</strong></p>

<table>
  <thead>
    <tr>
      <th>Target market</th>
      <th>Four-year cumulative excess return</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>CSI 500 (Chinese mid-caps)</td>
      <td><strong>approx. +160%</strong></td>
    </tr>
    <tr>
      <td>S&amp;P 500 (US market)</td>
      <td><strong>approx. +137%</strong></td>
    </tr>
  </tbody>
</table>

<p>In particular, <strong>from around December 2023</strong> the competing methods stall as the market regime shifts, while QuantaAlpha keeps a steady upward path.</p>

<hr />

<h2 id="6-절제-연구-ablation-study">6. Ablation Study</h2>

<h3 id="61-진화-구성요소별-기여도">6.1 Contribution of Each Evolution Component</h3>

<p>The effect of removing each component one at a time:</p>

<table>
  <thead>
    <tr>
      <th>Configuration</th>
      <th>IC</th>
      <th>Rank IC</th>
      <th>ARR</th>
      <th>MDD</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Baseline (full QuantaAlpha, absolute values)</td>
      <td>0.1493</td>
      <td>0.1458</td>
      <td>28.99%</td>
      <td>9.42%</td>
    </tr>
    <tr>
      <td>Without diversified initialization (change vs. baseline)</td>
      <td>-0.0005</td>
      <td>-0.0006</td>
      <td><strong>-7.78%</strong></td>
      <td>+2.73%</td>
    </tr>
    <tr>
      <td>Without mutation (change vs. baseline)</td>
      <td><strong>-0.0292</strong></td>
      <td><strong>-0.0284</strong></td>
      <td><strong>-9.81%</strong></td>
      <td>+0.43%</td>
    </tr>
    <tr>
      <td>Without crossover (change vs. baseline)</td>
      <td>-0.0070</td>
      <td>-0.0077</td>
      <td>-2.82%</td>
      <td>+1.21%</td>
    </tr>
  </tbody>
</table>

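<p><strong>Interpretation:</strong></p>
<ul>
  <li><strong>Mutation</strong>: has the largest effect on both predictive power (IC) and return (ARR). It is the key to effective exploration</li>
  <li><strong>Diversified initialization</strong>: affects return and risk far more than IC. It is the foundation for stable evolution</li>
  <li><strong>Crossover</strong>: a comparatively small contribution, but reusing validated patterns improves stability</li>
</ul>

<h3 id="62-팩터-생성-제어의-기여도">6.2 Contribution of Factor-Generation Control</h3>

<p>Removing any one of the three constraints (consistency verification, complexity control, redundancy filter) lowers performance. Removing <strong>complexity control</strong> hurts most at the strategy level: annualized excess return changes by -8.44% and MDD by +2.57%.</p>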

<hr />

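<h2 id="7-알파-붕괴-분석-2023년-중국-시장-국면-전환">7. Alpha Decay Analysis: The 2023 Regime Shift in the Chinese Market</h2>

<p>This section is the most practical and interesting part of the paper.</p>

<h3 id="71-2023년-무슨-일이-있었나">7.1 What Happened in 2023?</h3>

<p>The Chinese A-share market went through a <strong>pronounced style shift</strong> in 2023:</p>
<ul>
  <li><strong>Before the shift (2016-2022)</strong>: an institution-led large-cap market with stable trends and regular mean reversion</li>
  <li><strong>After the shift (2023 onward)</strong>: centered on small caps and theme stocks, with high intraday noise, frequent overnight gaps, and fast sector rotation</li>
</ul>

<p>As a result of this shift, many existing factors lost their effectiveness in 2023.</p>

<h3 id="72-quantaalpha-vs-alphaagent-2023년-팩터-성능-비교">7.2 QuantaAlpha vs AlphaAgent: Factor Performance in 2023</h3>

<p><strong>QuantaAlpha's strongest factors:</strong></p>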

<table>
  <thead>
    <tr>
      <th>Factor</th>
      <th>Rank IC</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">GapZ10_Overnight_vs_TR</code></td>
      <td>0.0793</td>
      <td>Normalizes the size of the overnight gap by the recent True Range. Captures the shock from the opening call auction and the adjustment that follows</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Gap_IntradayAcceptanceScore_20D</code></td>
      <td>0.0744</td>
      <td>Judges whether an overnight gap is accepted or rejected based on intraday direction, scaled by recent volatility</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Gap_IntradayAcceptance_VolWeighted_20D</code></td>
      <td>0.0606</td>
      <td>Gap acceptance score weighted by abnormal volume. Focuses on information-rich opens</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">CleanTrend_Continuation_Score_RS10_WVMA5</code></td>
      <td>0.0590</td>
      <td>Captures trend continuation only under low residual noise and weak volume pressure</td>
    </tr>
  </tbody>
</table>

<p><strong>AlphaAgent's strongest factors:</strong></p>

<table>
  <thead>
    <tr>
      <th>Factor</th>
      <th>Rank IC</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Exhaustion_Intensity_Index_10D</code></td>
      <td>0.0323</td>
      <td>60-day price displacement × volume intensity. Captures exhaustion and reversal</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">Climax_Exhaustion_Intensity</code></td>
      <td>0.0242</td>
      <td>Short-term volume climax vs. a long-term baseline. Identifies capitulation-style reversals</td>
    </tr>
  </tbody>
</table>

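<h3 id="73-왜-quantaalpha-팩터들이-더-강건한가">7.3 Why Are QuantaAlpha's Factors More Robust?</h3>

<p>In the small-cap, theme-driven market of 2023:</p>

<ol>
  <li><strong>Overnight gap signals</strong>: disclosures and news released after the close concentrate information in non-trading hours. When intraday predictability weakens, this channel becomes more important</li>
  <li><strong>Volatility-structure signals</strong>: volatility clustering is a microstructural property that persists even when the market style changes</li>
  <li><strong>Trend-quality conditional signals</strong>: trend continuation is followed only when residual volatility is low and liquidity confirms it, so the factor is less easily fooled by the noisy false trends of small caps</li>
</ol>

<p><strong>Summary statistics (2023):</strong></p>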

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>QuantaAlpha</th>
      <th>AlphaAgent</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Valid-metric coverage</td>
      <td>98%</td>
      <td>80%</td>
    </tr>
    <tr>
      <td>Share of factors with Rank IC &gt; 0</td>
      <td>62.6%</td>
      <td>59.4%</td>
    </tr>
    <tr>
      <td><strong>Mean Rank IC</strong></td>
      <td><strong>0.0057</strong></td>
      <td>0.0012</td>
    </tr>
    <tr>
      <td><strong>Share of factors with Rank IC &gt; 0.03</strong></td>
      <td><strong>10.2%</strong></td>
      <td>1.56%</td>
    </tr>
    <tr>
      <td><strong>Share of factors with Rank IC &gt; 0.05</strong></td>
      <td><strong>2.72%</strong></td>
      <td>0.00%</td>
    </tr>
  </tbody>
</table>

<p>Because the mutation mechanism keeps QuantaAlpha's factor population spread across diverse information channels, some of those factors continue to work even when the market style shifts.</p>

<hr />

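<h2 id="8-반복적-발전-분석-iteration-analysis">8. Iteration Analysis</h2>

<h3 id="81-진화-효율성">8.1 Evolution Efficiency</h3>

<p>Tracking the IC distribution over five iterations shows:</p>
<ul>
  <li>QuantaAlpha: IC rises quickly in the early iterations and stabilizes at a high level</li>
  <li>AlphaAgent: converges at a lower level than QuantaAlpha</li>
  <li>RD-Agent: the lowest and most homogeneous IC distribution (lack of diversity)</li>
</ul>

<h3 id="82-수렴-분석-case-study-deepseek-v32-15-iterations">8.2 Convergence Analysis (Case Study: DeepSeek-V3.2, 15 iterations)</h3>

<p><strong>How the factor evolves over iterations 1 to 5:</strong></p>

<ul>
  <li><strong>Iteration 1</strong>: a short-term reversal factor (an interpretable, simple formula)</li>
  <li><strong>Iteration 2</strong>: the mechanism expands to volatility-weighted momentum, but the added complexity weakens generalization</li>
  <li><strong>Iterations 3-4</strong>: simplified into a linear additive form. MDD improves and performance stabilizes</li>
  <li><strong>Iteration 5</strong>: a signal that distinguishes the behavior of market participants is added. The complementary information improves predictive power</li>
</ul>

<p><strong>Optimal number of iterations:</strong>
Performance reaches its best balance between return and drawdown at iterations 11-12, by which point about 350 factors have accumulated in the pool. Beyond that, redundant information grows and strategy robustness actually declines.</p>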

<hr />

<h2 id="9-왜-이것이-중요한가">9. Why Does This Matter?</h2>

<h3 id="91-퀀트-투자의-새로운-패러다임">9.1 A New Paradigm for Quant Investing</h3>

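<table>
  <thead>
    <tr>
      <th>Aspect</th>
      <th>Conventional approach</th>
      <th>QuantaAlpha</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Methodology</td>
      <td>Human researchers generate hypotheses</td>
      <td>An LLM generates diverse hypotheses automatically</td>
    </tr>
    <tr>
      <td>Search scope</td>
      <td>Within the range of human intuition</td>
      <td>Broad: price, volume, behavior, microstructure, and more</td>
    </tr>
    <tr>
      <td>Refinement</td>
      <td>Manual revision after backtesting</td>
      <td>Automatic refinement through an evolutionary algorithm</td>
    </tr>
    <tr>
      <td>Explainability</td>
      <td>High</td>
      <td>High (transparent AST-based representation)</td>
    </tr>
    <tr>
      <td>Response to alpha decay</td>
      <td>Manual monitoring and replacement</td>
      <td>Automatic, through a diverse factor pool</td>
    </tr>
  </tbody>
</table>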

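<h3 id="92-해석-가능한-ai">9.2 Interpretable AI</h3>

<p>Unlike deep-learning black-box models, QuantaAlpha:</p>
<ul>
  <li>Makes the entire path from natural-language hypothesis to formula to code traceable</li>
  <li>Can explain which market mechanism an investment relies on</li>
  <li>Can present the rationale for a strategy to regulators, which matters in a regulated industry such as finance</li>
</ul>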

<hr />

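<h2 id="10-한계와-주의사항">10. Limitations and Caveats</h2>

<ol>
  <li><strong>Risk of backtest overfitting</strong>: however good the methodology, optimizing on historical data always carries a risk of overfitting</li>
  <li><strong>Transaction costs not reflected</strong>: in real trading, slippage, commissions, and market-impact costs cut heavily into returns</li>
  <li><strong>LLM operating cost</strong>: running 15 iterations with GPT-5.2 incurs substantial API costs, which is a barrier to entry for small investors</li>
  <li><strong>Not validated on the Korean market</strong>: the paper validates only CSI 300/500 (China) and S&amp;P 500 (US). Performance under the particular characteristics of KOSPI/KOSDAQ (settlement dates, foreign investor flows, short-selling restrictions, and so on) needs separate validation</li>
  <li><strong>Data accessibility</strong>: high-quality tick data, overnight gap data, and similar inputs may be harder for individual investors to obtain than for institutions</li>
</ol>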

<hr />

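<h2 id="결론">Conclusion</h2>

<p>By combining <strong>LLMs with evolutionary algorithms</strong>, QuantaAlpha sets a new bar for alpha factor mining.</p>

<p>Its core contribution in one line: <strong>"Pass down and crossbreed good alpha-mining processes (trajectories) like genes, and find better and better factors."</strong></p>

<p>In particular, the way it held up through the 2023 regime shift in the Chinese market suggests that the system is not simply memorizing past data but discovering <strong>structural factors</strong> in a meaningful sense.</p>

<p>The era in which AI finds quant alpha on its own is approaching fast.</p>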

<hr />

<blockquote>
  <p><strong>📌 Reference paper</strong>: <a href="https://arxiv.org/abs/2602.07085">arXiv:2602.07085</a></p>

  <p><em>This post summarizes and interprets the paper and is not investment advice. Always make actual investment decisions with care.</em></p>
</blockquote>]]></content><author><name>Jonghyun Ho</name></author><category term="AI" /><category term="Finance" /><category term="Quant" /><category term="LLM" /><category term="Alpha Mining" /><category term="Quant" /><category term="AI Agent" /><category term="주식" /><category term="딥러닝" /><category term="진화알고리즘" /><summary type="html"><![CDATA[📄 논문: QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining (Jun Han, Shuo Zhang, Wei Li 외 24인 - 상하이재경대(SUFE), QuantaAlpha社, Stanford, 북경대(PKU), 2026.02)]]></summary></entry><entry><title type="html">Learning CarRacing environment with stable-baselines3</title><link href="https://jonghyunho.github.io/reinforcement/learning/learning-carracing-env-with-stable-baselines3.html" rel="alternate" type="text/html" title="Learning CarRacing environment with stable-baselines3" /><published>2022-06-05T00:00:00+09:00</published><updated>2022-06-05T00:00:00+09:00</updated><id>https://jonghyunho.github.io/reinforcement/learning/learning-carracing-env-with-stable-baselines3</id><content type="html" xml:base="https://jonghyunho.github.io/reinforcement/learning/learning-carracing-env-with-stable-baselines3.html"><![CDATA[<p>강화학습을 좀 더 쉽게 할 수 있도록 도와주는 라이브러리인 stable-baselines3 를 활용하여 <code class="language-plaintext highlighter-rouge">CarRacing</code> 환경을 학습해본다.</p>

<h2 id="환경-및-설치">Environment and Installation</h2>

<p>Anaconda environment on <code class="language-plaintext highlighter-rouge">Windows 10</code></p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> pip <span class="nb">install </span>stable-baselines3[extra]
</code></pre></div></div>

<h2 id="모델-학습">Training the Model</h2>

<p>The model is trained with the PPO algorithm.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># train.py
</span>
<span class="kn">import</span> <span class="nn">gym</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">from</span> <span class="nn">stable_baselines3</span> <span class="kn">import</span> <span class="n">PPO</span>
<span class="kn">from</span> <span class="nn">stable_baselines3.common.callbacks</span> <span class="kn">import</span> <span class="n">EvalCallback</span>

<span class="n">env</span> <span class="o">=</span> <span class="n">gym</span><span class="p">.</span><span class="n">make</span><span class="p">(</span><span class="s">'CarRacing-v0'</span><span class="p">)</span>

<span class="n">log_path</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="s">'./Training/Logs'</span><span class="p">)</span>
<span class="n">model</span> <span class="o">=</span> <span class="n">PPO</span><span class="p">(</span><span class="s">'CnnPolicy'</span><span class="p">,</span> <span class="n">env</span><span class="p">,</span> <span class="n">verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">tensorboard_log</span><span class="o">=</span><span class="n">log_path</span><span class="p">)</span>
<span class="n">ppo_path</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="s">'./Training/Saved_Models/PPO_car_best_Model'</span><span class="p">)</span>
<span class="n">eval_env</span> <span class="o">=</span> <span class="n">model</span><span class="p">.</span><span class="n">get_env</span><span class="p">()</span>
<span class="n">eval_callback</span> <span class="o">=</span> <span class="n">EvalCallback</span><span class="p">(</span><span class="n">eval_env</span><span class="o">=</span><span class="n">eval_env</span><span class="p">,</span> <span class="n">best_model_save_path</span><span class="o">=</span><span class="n">ppo_path</span><span class="p">,</span>
                             <span class="n">n_eval_episodes</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span>
                             <span class="n">eval_freq</span><span class="o">=</span><span class="mi">50000</span><span class="p">,</span> <span class="n">verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
                             <span class="n">deterministic</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">render</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">model</span><span class="p">.</span><span class="n">learn</span><span class="p">(</span><span class="n">total_timesteps</span><span class="o">=</span><span class="mi">1000000</span><span class="p">,</span> <span class="n">callback</span><span class="o">=</span><span class="n">eval_callback</span><span class="p">)</span>
<span class="n">ppo_path</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="s">'./Training/Saved_Models/PPO_1m_Model_final'</span><span class="p">)</span>
<span class="n">model</span><span class="p">.</span><span class="n">save</span><span class="p">(</span><span class="n">ppo_path</span><span class="p">)</span>
</code></pre></div></div>

<h2 id="학습-진행-상황-확인">Monitoring Training Progress</h2>

<p>TensorBoard can be used to check progress while training is running.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> tensorboard <span class="nt">--logdir</span><span class="o">=</span>./
</code></pre></div></div>

<p>After starting TensorBoard, open http://localhost:6006/ in a web browser.</p>

<p><img src="/assets/img/posts/20220605/tensorboard-carracing-sb3.png" alt="tensorboard-carracing-sb3" /></p>

<p>Over the 1,000,000 training timesteps, the <code class="language-plaintext highlighter-rouge">rollout/ep_rew_mean</code> graph shows the mean episode reward observed during training.</p>

<p>The <code class="language-plaintext highlighter-rouge">eval/mean_reward</code> graph gets a new point each time the model is evaluated partway through training.</p>

<p>In the code above, <code class="language-plaintext highlighter-rouge">eval_freq=50000</code> means the model is evaluated every 50,000 timesteps.</p>

<h2 id="학습-모델-평가">Evaluating the Trained Model</h2>

<p>Load the saved reinforcement-learning model and watch how it behaves in the CarRacing environment.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># test.py
</span>
<span class="kn">import</span> <span class="nn">gym</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">from</span> <span class="nn">stable_baselines3</span> <span class="kn">import</span> <span class="n">PPO</span>

<span class="n">env</span> <span class="o">=</span> <span class="n">gym</span><span class="p">.</span><span class="n">make</span><span class="p">(</span><span class="s">'CarRacing-v0'</span><span class="p">)</span>

<span class="n">model</span> <span class="o">=</span> <span class="n">PPO</span><span class="p">.</span><span class="n">load</span><span class="p">(</span>
    <span class="s">'./Training/Saved_Models/PPO_car_best_Model/best_model.zip'</span><span class="p">,</span> <span class="n">env</span><span class="o">=</span><span class="n">env</span><span class="p">)</span>

<span class="n">obs</span> <span class="o">=</span> <span class="n">env</span><span class="p">.</span><span class="n">reset</span><span class="p">()</span>
<span class="n">episode_reward</span> <span class="o">=</span> <span class="p">[]</span>

<span class="n">done</span> <span class="o">=</span> <span class="bp">False</span>
<span class="k">while</span> <span class="ow">not</span> <span class="n">done</span><span class="p">:</span>
    <span class="n">env</span><span class="p">.</span><span class="n">render</span><span class="p">()</span>

    <span class="n">action</span><span class="p">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">model</span><span class="p">.</span><span class="n">predict</span><span class="p">(</span><span class="n">obs</span><span class="p">.</span><span class="n">copy</span><span class="p">())</span>
    <span class="n">obs</span><span class="p">,</span> <span class="n">reward</span><span class="p">,</span> <span class="n">done</span><span class="p">,</span> <span class="n">info</span> <span class="o">=</span> <span class="n">env</span><span class="p">.</span><span class="n">step</span><span class="p">(</span><span class="n">action</span><span class="p">)</span>

    <span class="n">episode_reward</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">reward</span><span class="p">)</span>

<span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">'episode_reward: </span><span class="si">{</span><span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">episode_reward</span><span class="p">).</span><span class="n">mean</span><span class="p">()</span><span class="si">}</span><span class="s">'</span><span class="p">)</span>

<span class="n">env</span><span class="p">.</span><span class="n">close</span><span class="p">()</span>
</code></pre></div></div>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> python test.py
Wrapping the <span class="nb">env </span>with a <span class="sb">`</span>Monitor<span class="sb">`</span> wrapper
Wrapping the <span class="nb">env </span><span class="k">in </span>a DummyVecEnv.
Wrapping the <span class="nb">env </span><span class="k">in </span>a VecTransposeImage.
Track generation: 1111..1393 -&gt; 282-tiles track
episode_reward: 0.8608540925266758
</code></pre></div></div>

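<p>For the trained model run in the <code class="language-plaintext highlighter-rouge">CarRacing</code> environment, the mean per-step <code class="language-plaintext highlighter-rouge">Reward</code> over the episode came out to roughly <code class="language-plaintext highlighter-rouge">0.86</code> (the script averages the reward of every step rather than summing it into an episode return), and the driving looks like this.</p>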

<p><img src="/assets/img/posts/20220605/carracing_ppo.gif" alt="carracing_ppo" /></p>

<h2 id="reference">Reference</h2>

<p><a href="https://www.kaggle.com/code/manthanbhagat/car-racing-stable-baselines/notebook">Car-Racing Stable Baselines</a></p>]]></content><author><name>Jonghyun Ho</name></author><category term="Reinforcement" /><category term="Learning" /><category term="OpenAI" /><category term="gym" /><category term="CarRacing" /><category term="Python" /><category term="Reinforcement Learning" /><category term="강화학습" /><category term="baselines" /><category term="stable-baselines3" /><summary type="html"><![CDATA[강화학습을 좀 더 쉽게 할 수 있도록 도와주는 라이브러리인 stable-baselines3 를 활용하여 CarRacing 환경을 학습해본다.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://jonghyunho.github.io/posts/20220605/carracing_ppo.gif" /><media:content medium="image" url="https://jonghyunho.github.io/posts/20220605/carracing_ppo.gif" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">서울의 야경, 우면산</title><link href="https://jonghyunho.github.io/photo/%EC%9A%B0%EB%A9%B4%EC%82%B0.html" rel="alternate" type="text/html" title="서울의 야경, 우면산" /><published>2021-10-31T00:00:00+09:00</published><updated>2021-10-31T00:00:00+09:00</updated><id>https://jonghyunho.github.io/photo/%EC%9A%B0%EB%A9%B4%EC%82%B0</id><content type="html" xml:base="https://jonghyunho.github.io/photo/%EC%9A%B0%EB%A9%B4%EC%82%B0.html"><![CDATA[<p>10월의 마지막 날 서울의 야경이 보고 싶어 우면산 정상 소망탑에 올랐다.</p>

<p>It has already turned chilly, and autumn is passing.</p>

<p><img src="/assets/img/posts/20211031/IMG_5915.JPG" alt="photo" /></p>

<p><img src="/assets/img/posts/20211031/IMG_5916.JPG" alt="photo" /></p>

<p><img src="/assets/img/posts/20211031/IMG_5926.JPG" alt="photo" /></p>

<p><img src="/assets/img/posts/20211031/IMG_5929.JPG" alt="photo" /></p>

<p><img src="/assets/img/posts/20211031/IMG_5940.JPG" alt="photo" /></p>

<p><img src="/assets/img/posts/20211031/IMG_5943.JPG" alt="photo" /></p>

<p><img src="/assets/img/posts/20211031/IMG_5945.JPG" alt="photo" /></p>

<p><img src="/assets/img/posts/20211031/IMG_5948.JPG" alt="photo" /></p>]]></content><author><name>Jonghyun Ho</name></author><category term="Photo" /><category term="우면산" /><category term="서울" /><category term="야경" /><summary type="html"><![CDATA[On the last day of October I wanted to see the night view of Seoul, so I climbed to Somang Tower at the summit of Umyeonsan.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://jonghyunho.github.io/posts/20211031/IMG_5940_thumbnail.jpg" /><media:content medium="image" url="https://jonghyunho.github.io/posts/20211031/IMG_5940_thumbnail.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">The Composite Leading Indicator and the KOSPI Index</title><link href="https://jonghyunho.github.io/data/analysis/composite-leading-indicator-and-kospi-copy.html" rel="alternate" type="text/html" title="The Composite Leading Indicator and the KOSPI Index" /><published>2020-08-18T00:00:00+09:00</published><updated>2020-08-18T00:00:00+09:00</updated><id>https://jonghyunho.github.io/data/analysis/composite-leading-indicator-and-kospi%20copy</id><content type="html" xml:base="https://jonghyunho.github.io/data/analysis/composite-leading-indicator-and-kospi-copy.html"><![CDATA[<p>If we know which way the <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code> is trending, can we understand the cyclical structure of the economy?</p>

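<p>This post examines how that trend affects stock prices.</p>

<h2 id="경기-선행-지수">Composite Leading Indicator</h2>

<p>The <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code> (CLI) forecasts economic conditions 6 to 9 months ahead for each country and region, and is used to predict turning points in the business cycle of individual countries and regions.</p>

<p>Both the <code class="language-plaintext highlighter-rouge">OECD</code> and <code class="language-plaintext highlighter-rouge">Statistics Korea</code> publish a <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code>, and the two institutions calculate it in slightly different ways.</p>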

<p><code class="language-plaintext highlighter-rouge">통계청 경기 선행 지수</code>의 경우 총 9개의 변수(구인구직비율, 재고순환지표, 소비자기대지수, 기계류 내수출하지수, 건설수주액, 코스피지수, 장단기금리차, 원자재지수, 수출입물가비율)를 이용하는 반면, <code class="language-plaintext highlighter-rouge">OECD 경기 선행 지수</code>에서 우리나라 지수는 6개의 변수(업황, 코스피 지수, 재고순환지표, 재고량, 장단기 금리차(3년물-1일물 금리), 순교역조건)만을 이용하고 있다.</p>

<p>As of this writing in August, the latest value in the <a href="http://www.index.go.kr/potal/main/EachDtlPageDetail.do?idx_cd=1057">Composite Leading Indicator published by Statistics Korea</a> is for June. Data that is two months old is somewhat less useful.</p>

<p>The <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code> published by the <code class="language-plaintext highlighter-rouge">OECD</code> is currently updated through July, so it is comparatively more useful.</p>

<p>For this reason, this post uses the data calculated by the <code class="language-plaintext highlighter-rouge">OECD</code>.</p>

<h4 id="출처">Sources</h4>

<ul>
  <li>
    <p><a href="https://terms.naver.com/entry.nhn?docId=3534606&amp;cid=40942&amp;categoryId=31906">OECD Composite Leading Indicator</a></p>
  </li>
  <li>
    <p><a href="https://kostat.go.kr/understand/info/info_lge/1/detail_lang.action?bmode=detail_lang&amp;pageNo=1&amp;keyWord=7&amp;cd=SL4428&amp;sTt=">Cyclical component of the leading index</a></p>
  </li>
</ul>

<h2 id="경기-선행-지수-데이터-얻기">Getting the Composite Leading Indicator data</h2>

<p>First, import the required libraries.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span>
<span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">json</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="nn">yfinance</span> <span class="k">as</span> <span class="n">yf</span>

<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="s">"figure.figsize"</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="mi">5</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="s">'lines.linewidth'</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="s">'lines.color'</span><span class="p">]</span> <span class="o">=</span> <span class="s">'b'</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">getCLI</code> function fetches <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code> data from the <code class="language-plaintext highlighter-rouge">OECD</code> in JSON format.</p>

<p>The <code class="language-plaintext highlighter-rouge">country</code> argument takes a country code such as <code class="language-plaintext highlighter-rouge">KOR</code> for Korea or <code class="language-plaintext highlighter-rouge">USA</code> for the United States.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Composite Leading Indicator
# https://data.oecd.org/leadind/composite-leading-indicator-cli.htm
</span><span class="k">def</span> <span class="nf">getCLI</span><span class="p">(</span><span class="n">country</span><span class="p">):</span>
    <span class="n">uri</span> <span class="o">=</span> <span class="s">'https://stats.oecd.org/sdmx-json/data/DP_LIVE/'</span> <span class="o">+</span> <span class="n">country</span> <span class="o">+</span> <span class="s">'.CLI.AMPLITUD.LTRENDIDX.M/OECD?json-lang=en&amp;dimensionAtObservation=allDimensions&amp;startPeriod=2005-01&amp;endPeriod=2020-12'</span>
    <span class="n">resp</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">uri</span><span class="p">)</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">loads</span><span class="p">(</span><span class="n">resp</span><span class="p">.</span><span class="n">text</span><span class="p">)</span>

    <span class="n">dates</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">cli</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">cli_code</span> <span class="o">=</span> <span class="s">'CLI.'</span> <span class="o">+</span> <span class="n">country</span>

    <span class="n">observations</span> <span class="o">=</span> <span class="n">result</span><span class="p">[</span><span class="s">'dataSets'</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s">'observations'</span><span class="p">]</span>
    <span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">observations</span><span class="p">:</span>
        <span class="n">obs</span> <span class="o">=</span> <span class="n">observations</span><span class="p">[</span><span class="n">key</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span>
        <span class="n">cli</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">obs</span><span class="p">)</span>

    <span class="n">time_period</span> <span class="o">=</span> <span class="n">result</span><span class="p">[</span><span class="s">'structure'</span><span class="p">][</span><span class="s">'dimensions'</span><span class="p">][</span><span class="s">'observation'</span><span class="p">][</span><span class="mi">5</span><span class="p">][</span><span class="s">'values'</span><span class="p">]</span>
    <span class="k">for</span> <span class="n">date</span> <span class="ow">in</span> <span class="n">time_period</span><span class="p">:</span>
        <span class="n">date</span> <span class="o">=</span> <span class="n">date</span><span class="p">[</span><span class="s">'id'</span><span class="p">]</span>
        <span class="n">year</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">date</span><span class="p">[:</span><span class="mi">4</span><span class="p">])</span>
        <span class="n">month</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">date</span><span class="p">[</span><span class="mi">5</span><span class="p">:</span><span class="mi">7</span><span class="p">])</span>
        <span class="n">dates</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">datetime</span><span class="p">(</span><span class="n">year</span><span class="o">=</span><span class="n">year</span><span class="p">,</span> <span class="n">month</span><span class="o">=</span><span class="n">month</span><span class="p">,</span> <span class="n">day</span><span class="o">=</span><span class="mi">1</span><span class="p">))</span>

    <span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">cli</span><span class="p">,</span> <span class="n">index</span><span class="o">=</span><span class="n">dates</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="n">cli_code</span><span class="p">])</span>
    <span class="k">return</span> <span class="n">df</span>
</code></pre></div></div>
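
<p>To make the indexing inside <code class="language-plaintext highlighter-rouge">getCLI</code> easier to follow, here is a minimal offline sketch. It assumes the response has the two parts the function reads: an <code class="language-plaintext highlighter-rouge">observations</code> mapping and a list of observation dimensions whose sixth entry holds the time periods. The <code class="language-plaintext highlighter-rouge">sample</code> payload, its dimension names, and the <code class="language-plaintext highlighter-rouge">parse_cli</code> helper are made up for illustration and are not real OECD output. Instead of relying on dictionary order, the sketch treats the last component of each observation key as the position in the time-period list.</p>

```python
from datetime import datetime

import pandas as pd

# Hypothetical, heavily trimmed payload that imitates the layout getCLI reads.
sample = {
    "dataSets": [{"observations": {
        "0:0:0:0:0:0": [99.5, 0],
        "0:0:0:0:0:1": [99.8, 0],
        "0:0:0:0:0:2": [100.2, 0],
    }}],
    "structure": {"dimensions": {"observation": [
        {"id": "LOCATION", "values": [{"id": "KOR"}]},
        {"id": "INDICATOR", "values": [{"id": "CLI"}]},
        {"id": "SUBJECT", "values": [{"id": "AMPLITUD"}]},
        {"id": "MEASURE", "values": [{"id": "LTRENDIDX"}]},
        {"id": "FREQUENCY", "values": [{"id": "M"}]},
        {"id": "TIME_PERIOD", "values": [
            {"id": "2020-01"}, {"id": "2020-02"}, {"id": "2020-03"}]},
    ]}},
}


def parse_cli(payload, column):
    """Pair every observation with its time period via the key's last component."""
    periods = payload["structure"]["dimensions"]["observation"][5]["values"]
    rows = {}
    for key, obs in payload["dataSets"][0]["observations"].items():
        time_pos = int(key.split(":")[-1])  # position in the time-period list
        month = datetime.strptime(periods[time_pos]["id"], "%Y-%m")
        rows[month] = obs[0]  # first element is the observed value
    return pd.Series(rows, name=column).sort_index().to_frame()


cli_df = parse_cli(sample, "CLI.KOR")
print(cli_df)
```

<p>Pairing values and dates through the key keeps the two lists aligned even if the observations arrive in a different order, which the two separate loops in the function above implicitly assume never happens.</p>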

<h2 id="주가-데이터-얻을-수-있는-함수">A function for fetching stock price data</h2>

<p>To compare against the <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code>, we also fetch stock price data.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">GetYahooFinance</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">code</span><span class="p">):</span>
    <span class="n">ticker</span> <span class="o">=</span> <span class="n">yf</span><span class="p">.</span><span class="n">Ticker</span><span class="p">(</span><span class="n">code</span><span class="p">)</span>
    <span class="n">ticker</span> <span class="o">=</span> <span class="n">ticker</span><span class="p">.</span><span class="n">history</span><span class="p">(</span><span class="n">period</span><span class="o">=</span><span class="s">'16y'</span><span class="p">)</span>
    <span class="c1"># Select and rename in one step so we never modify a slice in place</span>
    <span class="n">ticker</span> <span class="o">=</span> <span class="n">ticker</span><span class="p">[[</span><span class="s">'Close'</span><span class="p">]].</span><span class="n">rename</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="p">{</span><span class="s">'Close'</span><span class="p">:</span> <span class="n">name</span><span class="p">})</span>
    <span class="k">return</span> <span class="n">ticker</span>
</code></pre></div></div>

<h2 id="경기-선행-지수와-주가-지수의-비교-시각화">Comparing and visualizing the Composite Leading Indicator and stock indices</h2>

<p>We look at the Korean and US data side by side.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">code_names</span> <span class="o">=</span> <span class="p">[(</span><span class="s">'KOR'</span><span class="p">,</span> <span class="s">'Kospi'</span><span class="p">,</span> <span class="s">'^KS11'</span><span class="p">),</span>
              <span class="p">(</span><span class="s">'USA'</span><span class="p">,</span> <span class="s">'Nasdaq'</span><span class="p">,</span> <span class="s">'^IXIC'</span><span class="p">)]</span>  <span class="c1"># ^IXIC is the Nasdaq Composite; ^DJI is the Dow Jones</span>
</code></pre></div></div>

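<p>For each country, fetch the <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code> and the stock data, then merge them into a single DataFrame. The monthly indicator is interpolated to fill the daily rows.</p>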

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="n">country_code</span><span class="p">,</span> <span class="n">yf_name</span><span class="p">,</span> <span class="n">yf_code</span> <span class="ow">in</span> <span class="n">code_names</span><span class="p">:</span>
    <span class="n">cli_code</span> <span class="o">=</span> <span class="s">'CLI.'</span> <span class="o">+</span> <span class="n">country_code</span>
    <span class="n">cli</span> <span class="o">=</span> <span class="n">getCLI</span><span class="p">(</span><span class="n">country_code</span><span class="p">)</span>

    <span class="n">ticker</span> <span class="o">=</span> <span class="n">GetYahooFinance</span><span class="p">(</span><span class="n">yf_name</span><span class="p">,</span> <span class="n">yf_code</span><span class="p">)</span>

    <span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">concat</span><span class="p">([</span><span class="n">cli</span><span class="p">[</span><span class="n">cli_code</span><span class="p">],</span> <span class="n">ticker</span><span class="p">[</span><span class="n">yf_name</span><span class="p">]],</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">interpolate</span><span class="p">(</span><span class="n">limit_direction</span><span class="o">=</span><span class="s">'backward'</span><span class="p">)</span>
</code></pre></div></div>
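
<p>The effect of <code class="language-plaintext highlighter-rouge">concat</code> followed by <code class="language-plaintext highlighter-rouge">interpolate</code> is easier to see on toy data. The sketch below uses made-up numbers and hypothetical column names (<code class="language-plaintext highlighter-rouge">CLI.TOY</code>, <code class="language-plaintext highlighter-rouge">Index</code>). It shows that merging a monthly series with a daily one leaves most indicator rows empty, and that linear interpolation then fills them.</p>

```python
import numpy as np
import pandas as pd

# Made-up monthly indicator values (one per month start)
monthly = pd.Series(
    [100.0, 101.0, 99.0],
    index=pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01"]),
    name="CLI.TOY",
)

# Made-up daily price series covering the same span
daily_index = pd.date_range("2020-01-01", "2020-03-01", freq="D")
daily = pd.Series(
    np.linspace(2000.0, 2100.0, len(daily_index)),
    index=daily_index,
    name="Index",
)

# Outer-join on the date index: days without a monthly value become NaN
merged = pd.concat([monthly, daily], axis=1).sort_index()
missing_before = int(merged["CLI.TOY"].isna().sum())

# Default linear interpolation treats rows as equally spaced
filled = merged.interpolate(limit_direction="backward")
missing_after = int(filled["CLI.TOY"].isna().sum())

print(missing_before, missing_after)
print(filled.loc[pd.Timestamp("2020-01-16"), "CLI.TOY"])
```

<p>Note that the default linear method spaces rows evenly by position rather than by calendar time, so days on which the stock market is closed do not count toward the interpolation.</p>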

<p>Next, plot the data with the <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code> on the left axis and the stock index on the right axis.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">figure</span><span class="p">()</span>
    <span class="n">ax</span> <span class="o">=</span> <span class="n">fig</span><span class="p">.</span><span class="n">add_subplot</span><span class="p">(</span><span class="mi">111</span><span class="p">)</span>
    <span class="n">df</span><span class="p">[</span><span class="n">cli_code</span><span class="p">].</span><span class="n">plot</span><span class="p">(</span><span class="n">ax</span><span class="o">=</span><span class="n">ax</span><span class="p">)</span>
    <span class="n">ax2</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">twinx</span><span class="p">()</span>
    <span class="n">df</span><span class="p">[</span><span class="n">yf_name</span><span class="p">].</span><span class="n">plot</span><span class="p">(</span><span class="n">ax</span><span class="o">=</span><span class="n">ax2</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'orangered'</span><span class="p">)</span>
    <span class="n">fig</span><span class="p">.</span><span class="n">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="s">"upper left"</span><span class="p">,</span> <span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">bbox_transform</span><span class="o">=</span><span class="n">ax</span><span class="p">.</span><span class="n">transAxes</span><span class="p">)</span>
</code></pre></div></div>

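<p>The <code class="language-plaintext highlighter-rouge">gradient</code> variable stores the slope of the <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code>.</p>

<p>To mark the intervals that call for caution from an investment standpoint, the <code class="language-plaintext highlighter-rouge">warnings</code> variable flags only the intervals where the slope is negative, that is, where the indicator is falling.</p>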

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="n">gradient</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">gradient</span><span class="p">(</span><span class="n">df</span><span class="p">[</span><span class="n">cli_code</span><span class="p">])</span>
    <span class="n">warnings</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">Series</span><span class="p">(</span><span class="n">gradient</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">,</span> <span class="n">index</span><span class="o">=</span><span class="n">df</span><span class="p">.</span><span class="n">index</span><span class="p">)</span>
</code></pre></div></div>
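
<p>As a small worked example of this step, the sketch below applies the same two calls to a made-up six-month series (the values are hypothetical). <code class="language-plaintext highlighter-rouge">np.gradient</code> uses central differences in the interior and one-sided differences at the two ends.</p>

```python
import numpy as np
import pandas as pd

# Made-up indicator values: rises to a peak in March, dips, then recovers
toy_cli = pd.Series(
    [100.0, 100.5, 100.8, 100.6, 100.1, 100.3],
    index=pd.date_range("2020-01-01", periods=6, freq="MS"),
)

slope = np.gradient(toy_cli.to_numpy())
falling = pd.Series(slope < 0, index=toy_cli.index)

print(falling)
```

<p>In this toy series the flag first turns on in April, one month after the March peak, because the central difference at the peak itself is still slightly positive.</p>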

<p>The contiguous intervals flagged in <code class="language-plaintext highlighter-rouge">warnings</code> are then extracted and shaded gray on the graph.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="n">range_list</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">prev_val</span> <span class="o">=</span> <span class="bp">False</span>
    <span class="k">for</span> <span class="n">index</span><span class="p">,</span> <span class="n">value</span> <span class="ow">in</span> <span class="n">warnings</span><span class="p">.</span><span class="n">items</span><span class="p">():</span>  <span class="c1"># Series.iteritems() was removed in pandas 2.0</span>
        <span class="k">if</span> <span class="n">prev_val</span> <span class="o">!=</span> <span class="n">value</span><span class="p">:</span>
            <span class="k">if</span> <span class="n">value</span><span class="p">:</span>
                <span class="n">begin</span> <span class="o">=</span> <span class="n">index</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="n">range_list</span><span class="p">.</span><span class="n">append</span><span class="p">((</span><span class="n">begin</span><span class="p">,</span> <span class="n">index</span><span class="p">))</span>

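        <span class="n">prev_inx</span> <span class="o">=</span> <span class="n">index</span>
        <span class="n">prev_val</span> <span class="o">=</span> <span class="n">value</span>

    <span class="c1"># Close a range that is still open when the series ends</span>
    <span class="k">if</span> <span class="n">prev_val</span><span class="p">:</span>
        <span class="n">range_list</span><span class="p">.</span><span class="n">append</span><span class="p">((</span><span class="n">begin</span><span class="p">,</span> <span class="n">prev_inx</span><span class="p">))</span>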

    <span class="k">for</span> <span class="p">(</span><span class="n">begin</span><span class="p">,</span> <span class="n">end</span><span class="p">)</span> <span class="ow">in</span> <span class="n">range_list</span><span class="p">:</span>
        <span class="n">plt</span><span class="p">.</span><span class="n">axvspan</span><span class="p">(</span><span class="n">begin</span><span class="p">,</span> <span class="n">end</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'gray'</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span>

    <span class="n">plt</span><span class="p">.</span><span class="n">grid</span><span class="p">()</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>
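
<p>The range extraction can also be written without tracking the previous value by hand. The following is a sketch of one alternative, not the code used for the figures. <code class="language-plaintext highlighter-rouge">true_runs</code> is a hypothetical helper and the flags are toy data. Unlike the loop above, which ends each range at the first unflagged label, this version ends it at the last flagged label.</p>

```python
import pandas as pd


def true_runs(flags):
    """Return (first_label, last_label) for each consecutive run of True values."""
    flags = flags.astype(bool)
    # A new run starts wherever the value differs from the previous one
    run_id = (flags != flags.shift(fill_value=False)).cumsum()
    runs = []
    for _, run in flags[flags].groupby(run_id[flags]):
        runs.append((run.index[0], run.index[-1]))
    return runs


# Toy flags: one run covering Feb-Mar and one run still open at the end (May)
toy_flags = pd.Series(
    [False, True, True, False, True],
    index=pd.date_range("2020-01-01", periods=5, freq="MS"),
)
runs = true_runs(toy_flags)
print(runs)
```

<p>Each pair can be passed to <code class="language-plaintext highlighter-rouge">plt.axvspan</code> in the same way as the entries of <code class="language-plaintext highlighter-rouge">range_list</code>.</p>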

<p>Below is the <code class="language-plaintext highlighter-rouge">Korean Composite Leading Indicator</code> plotted together with the <code class="language-plaintext highlighter-rouge">KOSPI</code> index.</p>

<p><img src="/assets/img/posts/20200818/cli-and-kospi.png" alt="CLI and Kospi" /></p>

<p>Likewise, this is the <code class="language-plaintext highlighter-rouge">US Composite Leading Indicator</code> plotted together with the <code class="language-plaintext highlighter-rouge">Nasdaq</code> index.</p>

<p><img src="/assets/img/posts/20200818/cli-and-nasdaq.png" alt="CLI and Nasdaq" /></p>

<p>In the graphs above, once the slope of the <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code> turns from rising to falling (the shaded intervals), stock prices also tend to fall.</p>

<p>However, the KOSPI in 2010 and the Nasdaq in 2019 show that stock prices sometimes kept rising for more than a year after the leading indicator began to decline. The two do not match 100%, but the signal still seems worth watching as a caution flag.</p>

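<h2 id="결론">Conclusion</h2>

<p>This indicator appears to have two pitfalls.</p>

<p>First, although it is a <strong>leading</strong> indicator, its figures are published one to two months late, which reduces the benefit of it being leading.</p>

<p>Even so, when the axis is shifted by about one month, the pattern above still seems to hold to some extent.</p>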
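
<p>One way to try the one-month shift mentioned here is to move the index of the indicator series forward before merging it with prices. This is only a sketch on made-up values. The column name <code class="language-plaintext highlighter-rouge">CLI.TOY</code> and the one-month publication lag are assumptions for illustration.</p>

```python
import pandas as pd

# Made-up monthly indicator values stamped with the month they describe
toy_cli = pd.Series(
    [100.0, 100.4, 100.9],
    index=pd.date_range("2020-01-01", periods=3, freq="MS"),
    name="CLI.TOY",
)

# Assume each value only becomes known one month later:
# shift the dates forward, keep the values unchanged
published = toy_cli.shift(1, freq="MS")

print(published)
```

<p>Running the rest of the analysis on the shifted series shows what an investor could actually have seen at each date.</p>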

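<p>Second, the <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code> itself includes the <code class="language-plaintext highlighter-rouge">KOSPI index</code> as a component, so the two are bound to move together.</p>

<pre>
For Korea, the OECD Composite Leading Indicator uses only six variables (business conditions, KOSPI index, inventory circulation indicator, inventory volume, long-short term interest rate spread (3-year rate minus overnight rate), net terms of trade).
</pre>

<p>It would therefore be worth separating out components such as the <code class="language-plaintext highlighter-rouge">long-short term interest rate spread</code> and checking them on their own.</p>

<p>In the recent data, the decline in the <code class="language-plaintext highlighter-rouge">Composite Leading Indicator</code> did not start with the spread of the coronavirus. It had been under way before that, and the indicator has been rising again from late 2019 to the present.</p>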

<p>The indicator has only just turned upward while stock prices have already risen sharply. The directions agree, but it will be interesting to see how the two figures eventually converge.</p>]]></content><author><name>Jonghyun Ho</name></author><category term="Data" /><category term="Analysis" /><category term="Crawling" /><category term="Python" /><category term="CLI" /><category term="경기선행지수" /><category term="Kospi" /><category term="코스피" /><category term="Nasdaq" /><category term="나스닥" /><summary type="html"><![CDATA[If we know which way the Composite Leading Indicator is trending, can we understand the cyclical structure of the economy?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://jonghyunho.github.io/posts/20200818/cli-and-kospi.png" /><media:content medium="image" url="https://jonghyunho.github.io/posts/20200818/cli-and-kospi.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">The Relationship Between the Nasdaq and Fear Sentiment</title><link href="https://jonghyunho.github.io/data/analysis/mental-state-of-fear-and-nasdaq.html" rel="alternate" type="text/html" title="The Relationship Between the Nasdaq and Fear Sentiment" /><published>2020-07-25T00:00:00+09:00</published><updated>2020-07-25T00:00:00+09:00</updated><id>https://jonghyunho.github.io/data/analysis/mental-state-of-fear-and-nasdaq</id><content type="html" xml:base="https://jonghyunho.github.io/data/analysis/mental-state-of-fear-and-nasdaq.html"><![CDATA[<p>Daily new coronavirus cases in the United States currently exceed 70,000, and cumulative cases exceed 4 million.</p>

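<p>Fear of the coronavirus drove the US stock market down sharply in March, but the market has since recovered to some extent.</p>

<p>In fact, there are more confirmed cases now than in March, yet the market is no longer falling and is rising again.</p>

<p>If the market tracked new cases, the index should be lower now than in March. The fact that it is rising suggests it is not closely tied to the actual number of cases.</p>

<p>Is it because people now expect the coronavirus to persist, and continuing economic activity under everyday quarantine measures has become the new normal, so that fear is no longer as strong as in the early days of the outbreak?</p>

<p>From this perspective, this post looks at how the stock market and fear sentiment are related.</p>

<p>To gauge fear sentiment, we use the number of daily new cases in the US, Google Trends results for <code class="language-plaintext highlighter-rouge">covid</code>, and <code class="language-plaintext highlighter-rouge">VIX</code> index data. For the stock market, we use Nasdaq data.</p>

<h2 id="미국-일-신규-확진자-수">Daily new cases in the US</h2>

<p>The number of daily new cases in the US is available from the <a href="https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html">US Centers for Disease Control and Prevention</a>.</p>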

<p><img src="/assets/img/posts/20200725/us_new_cases_by_day.png" alt="US New Cases by Day" /></p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">json</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">from</span> <span class="nn">dateutil.parser</span> <span class="kn">import</span> <span class="n">parse</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>

<span class="k">def</span> <span class="nf">getUSCovidNewCases</span><span class="p">():</span>
    <span class="n">req</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">'https://www.cdc.gov/coronavirus/2019-ncov/json/new-cases-chart-data.json'</span><span class="p">)</span>
    <span class="n">new_cases</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">loads</span><span class="p">(</span><span class="n">req</span><span class="p">.</span><span class="n">text</span><span class="p">)</span>

    <span class="n">dates</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">new_cases</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">:])</span>
    <span class="n">num_cases</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">new_cases</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="mi">1</span><span class="p">:],</span> <span class="n">dtype</span><span class="o">=</span><span class="nb">float</span><span class="p">)</span>

    <span class="n">series</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">Series</span><span class="p">(</span><span class="n">dtype</span><span class="o">=</span><span class="nb">float</span><span class="p">)</span>
    <span class="n">series</span><span class="p">.</span><span class="n">index</span><span class="p">.</span><span class="n">name</span><span class="o">=</span><span class="s">'date'</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">date</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">dates</span><span class="p">):</span>
        <span class="n">date</span> <span class="o">=</span> <span class="n">parse</span><span class="p">(</span><span class="n">date</span><span class="p">)</span>
        <span class="n">num_case</span> <span class="o">=</span> <span class="n">num_cases</span><span class="p">[</span><span class="n">i</span><span class="p">]</span>
        <span class="n">series</span><span class="p">[</span><span class="n">date</span><span class="p">]</span> <span class="o">=</span> <span class="n">num_case</span>

    <span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">series</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s">'US Covid New Cases'</span><span class="p">])</span>
    <span class="k">return</span> <span class="n">df</span>
	
<span class="n">df_covid_new_cases</span> <span class="o">=</span> <span class="n">getUSCovidNewCases</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="n">df_covid_new_cases</span><span class="p">)</span>
</code></pre></div></div>

<p>Running the <code class="language-plaintext highlighter-rouge">getUSCovidNewCases</code> function produces the following result.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>            US Covid New Cases
date                          
2020-01-22                 1.0
2020-01-23                 0.0
2020-01-24                 1.0
2020-01-25                 0.0
2020-01-26                 3.0
...                        ...
2020-07-19             63201.0
2020-07-20             57777.0
2020-07-21             63028.0
2020-07-22             70106.0
2020-07-23             72219.0

[184 rows x 1 columns]
</code></pre></div></div>

<p>Let's visualize this as a graph.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>df_covid_new_cases.plot()
plt.grid()
plt.show()
</code></pre></div></div>

<p><img src="/assets/img/posts/20200725/crawling_us_new_cases_by_day.png" alt="Crawling US new cases by day" /></p>
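
<p>Going back to <code class="language-plaintext highlighter-rouge">getUSCovidNewCases</code> above, the row-by-row loop can be replaced by building the frame in one step. The sketch below runs offline on a made-up payload. It assumes the same two-row layout the function indexes into (a label followed by dates, then a label followed by counts) and assumes a month/day/year date format. The <code class="language-plaintext highlighter-rouge">sample</code> data and the <code class="language-plaintext highlighter-rouge">to_frame</code> helper are hypothetical.</p>

```python
import pandas as pd

# Hypothetical payload imitating the layout the function reads:
# row 0 = label followed by dates, row 1 = label followed by counts
sample = [
    ["x", "1/22/2020", "1/23/2020", "1/24/2020"],
    ["New Cases", "1", "0", "1"],
]


def to_frame(payload, column="US Covid New Cases"):
    """Build a date-indexed DataFrame from the two-row payload in one step."""
    dates = pd.to_datetime(payload[0][1:], format="%m/%d/%Y")
    counts = pd.to_numeric(payload[1][1:]).astype(float)
    return pd.DataFrame({column: counts}, index=pd.Index(dates, name="date"))


df_sample = to_frame(sample)
print(df_sample)
```

<p>Parsing all dates in a single call avoids growing a Series one element at a time, which becomes slow as the number of days increases.</p>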

<h2 id="구글-트렌드의-covid-검색-결과">Google Trends results for <code class="language-plaintext highlighter-rouge">covid</code></h2>

<p>Below is the result of searching for <code class="language-plaintext highlighter-rouge">covid</code> on <a href="https://trends.google.com/trends/explore?q=covid&amp;geo=US">Google Trends</a>.</p>

<p><img src="/assets/img/posts/20200725/google_trends_us_covid.png" alt="Google Trends US Covid" /></p>

<p>While the number of daily new cases has kept rising recently, search volume for <code class="language-plaintext highlighter-rouge">covid</code> rose for a while and has lately been falling again.</p>

<p>How often people search for <code class="language-plaintext highlighter-rouge">covid</code> probably resembles the actual number of daily new cases, but my guess is that it also reflects public sentiment to some degree.</p>

<p>After all, people search for what they care about.</p>

<p>To find out how strong the correlation is, this data is also included in the analysis.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">pytrends.request</span> <span class="kn">import</span> <span class="n">TrendReq</span>

<span class="k">def</span> <span class="nf">getGoogleTrend</span><span class="p">(</span><span class="n">keyword</span><span class="p">,</span> <span class="n">column_name</span><span class="p">):</span>
    <span class="n">request</span> <span class="o">=</span> <span class="n">TrendReq</span><span class="p">(</span><span class="n">hl</span><span class="o">=</span><span class="s">'en-US'</span><span class="p">,</span> <span class="n">tz</span><span class="o">=</span><span class="mi">360</span><span class="p">)</span>
    <span class="n">kw_list</span> <span class="o">=</span> <span class="p">[</span><span class="n">keyword</span><span class="p">]</span>
    <span class="n">request</span><span class="p">.</span><span class="n">build_payload</span><span class="p">(</span>
         <span class="n">kw_list</span><span class="o">=</span><span class="n">kw_list</span><span class="p">,</span>
         <span class="n">cat</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
         <span class="n">timeframe</span><span class="o">=</span><span class="s">'today 12-m'</span><span class="p">,</span>
         <span class="n">geo</span><span class="o">=</span><span class="s">'US'</span><span class="p">,</span>
         <span class="n">gprop</span><span class="o">=</span><span class="s">''</span><span class="p">)</span>
    <span class="n">df</span> <span class="o">=</span> <span class="n">request</span><span class="p">.</span><span class="n">interest_over_time</span><span class="p">()</span>
    <span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">[[</span><span class="n">keyword</span><span class="p">]].</span><span class="n">astype</span><span class="p">(</span><span class="nb">float</span><span class="p">)</span>
    <span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">rename</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="p">{</span><span class="n">keyword</span><span class="p">:</span> <span class="n">column_name</span><span class="p">})</span>
    <span class="k">return</span> <span class="n">df</span>

<span class="n">df_covid_trend</span> <span class="o">=</span> <span class="n">getGoogleTrend</span><span class="p">(</span><span class="s">'covid'</span><span class="p">,</span> <span class="s">'US Covid Trend'</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">df_covid_trend</span><span class="p">)</span>
</code></pre></div></div>

<p>Running the <code class="language-plaintext highlighter-rouge">getGoogleTrend</code> function gives the following result.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>            <span class="n">US</span> <span class="n">Covid</span> <span class="n">Trend</span>
<span class="n">date</span>                      
<span class="mi">2019</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">28</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">04</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">11</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">18</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">25</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">01</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">08</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">15</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">22</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">29</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">06</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">13</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">20</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">27</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">03</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">10</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">17</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">24</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">01</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">08</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">15</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">22</span>             <span class="mf">0.0</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">29</span>             <span class="mf">0.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">05</span>             <span class="mf">0.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">12</span>             <span class="mf">0.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">19</span>             <span class="mf">0.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">26</span>             <span class="mf">0.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">02</span>             <span class="mf">0.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">09</span>             <span class="mf">1.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">16</span>             <span class="mf">1.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">23</span>             <span class="mf">4.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">01</span>             <span class="mf">8.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">08</span>            <span class="mf">31.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">15</span>            <span class="mf">78.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">22</span>           <span class="mf">100.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">29</span>            <span class="mf">93.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">05</span>            <span class="mf">86.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">12</span>            <span class="mf">79.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">19</span>            <span class="mf">71.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">26</span>            <span class="mf">68.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">03</span>            <span class="mf">67.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">10</span>            <span class="mf">66.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">17</span>            <span class="mf">61.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">24</span>            <span class="mf">56.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">31</span>            <span class="mf">51.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">06</span><span class="o">-</span><span class="mi">07</span>            <span class="mf">51.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">06</span><span class="o">-</span><span class="mi">14</span>            <span class="mf">53.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">06</span><span class="o">-</span><span class="mi">21</span>            <span class="mf">59.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">06</span><span class="o">-</span><span class="mi">28</span>            <span class="mf">62.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">05</span>            <span class="mf">63.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">12</span>            <span class="mf">67.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">19</span>            <span class="mf">65.0</span>
</code></pre></div></div>

<p>Let's plot this as a graph.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df_covid_trend</span><span class="p">.</span><span class="n">plot</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="n">grid</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>

<p><img src="/assets/img/posts/20200725/crawling_google_trends_us_covid.png" alt="Crawling Google Trends US Covid" /></p>

<p>Next, merge the two data frames into a single one named <code class="language-plaintext highlighter-rouge">df</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span> <span class="o">=</span> <span class="n">df_covid_new_cases</span><span class="p">.</span><span class="n">merge</span><span class="p">(</span><span class="n">df_covid_trend</span><span class="p">,</span> <span class="n">on</span><span class="o">=</span><span class="s">'date'</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
</code></pre></div></div>

<p>The output is as follows.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>            <span class="n">US</span> <span class="n">Covid</span> <span class="n">New</span> <span class="n">Cases</span>  <span class="n">US</span> <span class="n">Covid</span> <span class="n">Trend</span>
<span class="n">date</span>                                          
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">26</span>                 <span class="mf">3.0</span>             <span class="mf">0.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">02</span>                 <span class="mf">0.0</span>             <span class="mf">0.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">09</span>                 <span class="mf">0.0</span>             <span class="mf">1.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">16</span>                 <span class="mf">0.0</span>             <span class="mf">1.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">23</span>                 <span class="mf">0.0</span>             <span class="mf">4.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">01</span>                 <span class="mf">6.0</span>             <span class="mf">8.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">08</span>               <span class="mf">147.0</span>            <span class="mf">31.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">15</span>              <span class="mf">1237.0</span>            <span class="mf">77.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">22</span>              <span class="mf">8821.0</span>           <span class="mf">100.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">29</span>             <span class="mf">18251.0</span>            <span class="mf">92.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">05</span>             <span class="mf">26065.0</span>            <span class="mf">86.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">12</span>             <span class="mf">29145.0</span>            <span class="mf">78.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">19</span>             <span class="mf">25995.0</span>            <span class="mf">71.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">26</span>             <span class="mf">29256.0</span>            <span class="mf">68.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">03</span>             <span class="mf">29763.0</span>            <span class="mf">64.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">10</span>             <span class="mf">23792.0</span>            <span class="mf">65.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">17</span>             <span class="mf">13284.0</span>            <span class="mf">59.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">24</span>             <span class="mf">15342.0</span>            <span class="mf">55.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">31</span>             <span class="mf">26177.0</span>            <span class="mf">50.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">06</span><span class="o">-</span><span class="mi">07</span>             <span class="mf">17919.0</span>            <span class="mf">51.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">06</span><span class="o">-</span><span class="mi">14</span>             <span class="mf">21957.0</span>            <span class="mf">52.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">06</span><span class="o">-</span><span class="mi">21</span>             <span class="mf">27616.0</span>            <span class="mf">59.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">06</span><span class="o">-</span><span class="mi">28</span>             <span class="mf">41390.0</span>            <span class="mf">61.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">05</span>             <span class="mf">44361.0</span>            <span class="mf">62.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">12</span>             <span class="mf">60469.0</span>            <span class="mf">66.0</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">19</span>             <span class="mf">63201.0</span>            <span class="mf">63.0</span>
</code></pre></div></div>
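
<p>The merged table has one row per week even though the case data is daily. A minimal sketch of why, using made-up numbers in place of the real frames (the names <code class="language-plaintext highlighter-rouge">daily</code> and <code class="language-plaintext highlighter-rouge">weekly</code> are hypothetical stand-ins for <code class="language-plaintext highlighter-rouge">df_covid_new_cases</code> and <code class="language-plaintext highlighter-rouge">df_covid_trend</code>): <code class="language-plaintext highlighter-rouge">merge</code> performs an inner join by default, so only dates present in both frames are kept, and each surviving row carries that single day's case count rather than a weekly total.</p>

```python
import pandas as pd

# Hypothetical daily series standing in for df_covid_new_cases.
daily = pd.DataFrame(
    {"US Covid New Cases": [3.0, 0.0, 1.0, 2.0, 0.0, 5.0, 4.0, 6.0]},
    index=pd.date_range("2020-01-26", periods=8, freq="D", name="date"),
)

# Hypothetical weekly series standing in for df_covid_trend (one row per Sunday).
weekly = pd.DataFrame(
    {"US Covid Trend": [0.0, 1.0]},
    index=pd.DatetimeIndex(["2020-01-26", "2020-02-02"], name="date"),
)

# merge() is an inner join by default: only dates found in both frames survive.
merged = daily.merge(weekly, on="date")
print(merged)
```

<p>If a weekly total is what you want instead, one option is to aggregate the daily frame to weekly sums before merging; that is a different design choice from the one used in this post.</p>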

<h2 id="vix">VIX</h2>

<p><code class="language-plaintext highlighter-rouge">VIX(Volatility Index)</code> is an index that expresses the market's expectation of volatility over the next 30 days, derived from S&amp;P 500 index options listed on the Chicago Board Options Exchange. It tends to move in the opposite direction to stock indices.</p>

<p>For example, when the VIX reaches a peak, investor anxiety is at its extreme: everyone willing to sell has already sold, which leaves room for the stock index to rebound.</p>

<p>It is also known as the ‘fear index’.</p>

<p>Source: <a href="https://terms.naver.com/entry.nhn?docId=5698336&amp;cid=43659&amp;categoryId=43659">VIX</a></p>

<p>The Nasdaq index and the VIX index can be fetched through <code class="language-plaintext highlighter-rouge">Yahoo Finance</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">yfinance</span> <span class="k">as</span> <span class="n">yf</span>

<span class="k">def</span> <span class="nf">GetYahooFinance</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">code</span><span class="p">):</span>
    <span class="n">ticker</span> <span class="o">=</span> <span class="n">yf</span><span class="p">.</span><span class="n">Ticker</span><span class="p">(</span><span class="n">code</span><span class="p">)</span>
    <span class="n">ticker</span> <span class="o">=</span> <span class="n">ticker</span><span class="p">.</span><span class="n">history</span><span class="p">(</span><span class="n">period</span><span class="o">=</span><span class="s">'1y'</span><span class="p">)</span>
    <span class="n">ticker</span> <span class="o">=</span> <span class="n">ticker</span><span class="p">[[</span><span class="s">'Close'</span><span class="p">]].</span><span class="n">rename</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="p">{</span><span class="s">'Close'</span><span class="p">:</span> <span class="n">name</span><span class="p">})</span>
    <span class="n">ticker</span><span class="p">.</span><span class="n">index</span><span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="s">'date'</span>
    <span class="k">return</span> <span class="n">ticker</span>
</code></pre></div></div>

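<p>Create the data frame for the Nasdaq index. Note that <code class="language-plaintext highlighter-rouge">^DJI</code> is the Yahoo Finance symbol for the Dow Jones Industrial Average, so the values in the <code class="language-plaintext highlighter-rouge">Nasdaq</code> column in this post are actually Dow values. The symbol for the Nasdaq Composite is <code class="language-plaintext highlighter-rouge">^IXIC</code>.</p>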

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df_nasdaq</span> <span class="o">=</span> <span class="n">GetYahooFinance</span><span class="p">(</span><span class="s">'Nasdaq'</span><span class="p">,</span> <span class="s">'^DJI'</span><span class="p">)</span>
<span class="n">df_nasdaq</span> <span class="o">=</span> <span class="n">df_nasdaq</span><span class="p">[</span><span class="n">df_nasdaq</span><span class="p">.</span><span class="n">index</span> <span class="o">&gt;=</span> <span class="n">df</span><span class="p">.</span><span class="n">index</span><span class="p">[</span><span class="mi">0</span><span class="p">]]</span>
<span class="k">print</span><span class="p">(</span><span class="n">df_nasdaq</span><span class="p">)</span>
</code></pre></div></div>

<p>Nasdaq index data output</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>              <span class="n">Nasdaq</span>
<span class="n">date</span>                
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">27</span>  <span class="mf">28535.80</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">28</span>  <span class="mf">28722.85</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">29</span>  <span class="mf">28734.45</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">30</span>  <span class="mf">28859.44</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">31</span>  <span class="mf">28256.03</span>
<span class="p">...</span>              <span class="p">...</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">20</span>  <span class="mf">26680.87</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">21</span>  <span class="mf">26840.40</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">22</span>  <span class="mf">27005.84</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">23</span>  <span class="mf">26652.33</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">24</span>  <span class="mf">26469.89</span>

<span class="p">[</span><span class="mi">126</span> <span class="n">rows</span> <span class="n">x</span> <span class="mi">1</span> <span class="n">columns</span><span class="p">]</span>
</code></pre></div></div>

<p>Create the data frame for the <code class="language-plaintext highlighter-rouge">VIX</code> index</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df_vix</span> <span class="o">=</span> <span class="n">GetYahooFinance</span><span class="p">(</span><span class="s">'VIX'</span><span class="p">,</span> <span class="s">'^VIX'</span><span class="p">)</span>
<span class="n">df_vix</span> <span class="o">=</span> <span class="n">df_vix</span><span class="p">[</span><span class="n">df_vix</span><span class="p">.</span><span class="n">index</span> <span class="o">&gt;=</span> <span class="n">df</span><span class="p">.</span><span class="n">index</span><span class="p">[</span><span class="mi">0</span><span class="p">]]</span>
<span class="k">print</span><span class="p">(</span><span class="n">df_vix</span><span class="p">)</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">VIX</code> index data output</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>              <span class="n">VIX</span>
<span class="n">date</span>             
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">27</span>  <span class="mf">18.23</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">28</span>  <span class="mf">16.28</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">29</span>  <span class="mf">16.39</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">30</span>  <span class="mf">15.49</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">31</span>  <span class="mf">18.84</span>
<span class="p">...</span>           <span class="p">...</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">20</span>  <span class="mf">24.46</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">21</span>  <span class="mf">24.84</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">22</span>  <span class="mf">24.32</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">23</span>  <span class="mf">26.08</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">24</span>  <span class="mf">25.84</span>
</code></pre></div></div>

<h2 id="모든-데이터의-상관관계">Correlation across all the data</h2>

<p>Combine all of the data: daily new US cases, Google Trends search results for <code class="language-plaintext highlighter-rouge">covid</code>, the <code class="language-plaintext highlighter-rouge">VIX</code> index, and the Nasdaq index.</p>

<p>The data is normalized, and missing values are filled in by interpolation.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">concat</span><span class="p">([</span><span class="n">df_nasdaq</span><span class="p">[</span><span class="s">'Nasdaq'</span><span class="p">],</span> <span class="n">df</span><span class="p">[</span><span class="s">'US Covid New Cases'</span><span class="p">],</span> <span class="n">df</span><span class="p">[</span><span class="s">'US Covid Trend'</span><span class="p">],</span> <span class="n">df_vix</span><span class="p">[</span><span class="s">'VIX'</span><span class="p">]],</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">df</span> <span class="o">=</span> <span class="p">(</span><span class="n">df</span> <span class="o">-</span> <span class="n">df</span><span class="p">.</span><span class="n">mean</span><span class="p">())</span> <span class="o">/</span> <span class="n">df</span><span class="p">.</span><span class="n">std</span><span class="p">()</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">interpolate</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
</code></pre></div></div>

<p>Here is the combined result.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>              <span class="n">Nasdaq</span>  <span class="n">US</span> <span class="n">Covid</span> <span class="n">New</span> <span class="n">Cases</span>  <span class="n">US</span> <span class="n">Covid</span> <span class="n">Trend</span>       <span class="n">VIX</span>
<span class="n">date</span>                                                              
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">26</span>       <span class="n">NaN</span>           <span class="o">-</span><span class="mf">1.102948</span>       <span class="o">-</span><span class="mf">1.673008</span>       <span class="n">NaN</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">27</span>  <span class="mf">1.344532</span>           <span class="o">-</span><span class="mf">1.102976</span>       <span class="o">-</span><span class="mf">1.673008</span> <span class="o">-</span><span class="mf">1.117156</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">28</span>  <span class="mf">1.420651</span>           <span class="o">-</span><span class="mf">1.103003</span>       <span class="o">-</span><span class="mf">1.673008</span> <span class="o">-</span><span class="mf">1.249661</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">29</span>  <span class="mf">1.425371</span>           <span class="o">-</span><span class="mf">1.103030</span>       <span class="o">-</span><span class="mf">1.673008</span> <span class="o">-</span><span class="mf">1.242187</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">30</span>  <span class="mf">1.476235</span>           <span class="o">-</span><span class="mf">1.103058</span>       <span class="o">-</span><span class="mf">1.673008</span> <span class="o">-</span><span class="mf">1.303343</span>
<span class="p">...</span>              <span class="p">...</span>                 <span class="p">...</span>             <span class="p">...</span>       <span class="p">...</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">20</span>  <span class="mf">0.589677</span>            <span class="mf">2.354868</span>        <span class="mf">0.396771</span> <span class="o">-</span><span class="mf">0.693818</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">21</span>  <span class="mf">0.654597</span>            <span class="mf">2.354868</span>        <span class="mf">0.396771</span> <span class="o">-</span><span class="mf">0.667996</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">22</span>  <span class="mf">0.721922</span>            <span class="mf">2.354868</span>        <span class="mf">0.396771</span> <span class="o">-</span><span class="mf">0.703331</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">23</span>  <span class="mf">0.578062</span>            <span class="mf">2.354868</span>        <span class="mf">0.396771</span> <span class="o">-</span><span class="mf">0.583736</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">24</span>  <span class="mf">0.503819</span>            <span class="mf">2.354868</span>        <span class="mf">0.396771</span> <span class="o">-</span><span class="mf">0.600044</span>

<span class="p">[</span><span class="mi">152</span> <span class="n">rows</span> <span class="n">x</span> <span class="mi">4</span> <span class="n">columns</span><span class="p">]</span>
</code></pre></div></div>

<p>Let's compute the correlations among them.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">print</span><span class="p">(</span><span class="n">df</span><span class="p">.</span><span class="n">corr</span><span class="p">())</span>
</code></pre></div></div>

<p>The correlations are as follows.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                      <span class="n">Nasdaq</span>  <span class="n">US</span> <span class="n">Covid</span> <span class="n">New</span> <span class="n">Cases</span>  <span class="n">US</span> <span class="n">Covid</span> <span class="n">Trend</span>       <span class="n">VIX</span>
<span class="n">Nasdaq</span>              <span class="mf">1.000000</span>           <span class="o">-</span><span class="mf">0.080119</span>       <span class="o">-</span><span class="mf">0.845421</span> <span class="o">-</span><span class="mf">0.892276</span>
<span class="n">US</span> <span class="n">Covid</span> <span class="n">New</span> <span class="n">Cases</span> <span class="o">-</span><span class="mf">0.080119</span>            <span class="mf">1.000000</span>        <span class="mf">0.531151</span> <span class="o">-</span><span class="mf">0.087459</span>
<span class="n">US</span> <span class="n">Covid</span> <span class="n">Trend</span>     <span class="o">-</span><span class="mf">0.845421</span>            <span class="mf">0.531151</span>        <span class="mf">1.000000</span>  <span class="mf">0.689749</span>
<span class="n">VIX</span>                <span class="o">-</span><span class="mf">0.892276</span>           <span class="o">-</span><span class="mf">0.087459</span>        <span class="mf">0.689749</span>  <span class="mf">1.000000</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">Nasdaq</code> index and the <code class="language-plaintext highlighter-rouge">US Covid Trend</code> Google search volume have a correlation of <code class="language-plaintext highlighter-rouge">-0.85</code>, which is a strong negative correlation.</p>

<p>In other words, the index tended to fall as <code class="language-plaintext highlighter-rouge">covid</code> search volume rose.</p>

<p>The correlation between the <code class="language-plaintext highlighter-rouge">Nasdaq</code> index and the <code class="language-plaintext highlighter-rouge">VIX</code> index is <code class="language-plaintext highlighter-rouge">-0.89</code>, even stronger than the correlation with search volume.</p>

<p>Likewise, the stock market tended to fall as the fear sentiment represented by <code class="language-plaintext highlighter-rouge">VIX</code> increased.</p>
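<p>As a rough illustration of what <code class="language-plaintext highlighter-rouge">df.corr()</code> computes by default (the Pearson correlation coefficient), here is a minimal standard-library sketch. The helper <code class="language-plaintext highlighter-rouge">pearson</code> and the two toy series are made up for illustration and are not the data used in this post.</p>

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up toy series: the index falls while the fear gauge rises.
index_level = [100.0, 98.0, 95.0, 97.0, 90.0]
fear_gauge = [12.0, 14.0, 20.0, 16.0, 30.0]

r = pearson(index_level, fear_gauge)
print(round(r, 3))
```

<p>A value close to -1 means the two series move in opposite directions almost linearly. It says nothing by itself about which one drives the other.</p>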

<h2 id="데이터-시각화">Data Visualization</h2>

<p>The chart below plots the normalized DataFrame built above.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span><span class="p">.</span><span class="n">plot</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="n">grid</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>

<p><img src="/assets/img/posts/20200725/fear-and-stock.png" alt="Fear and Stock" /></p>

<p>With every series on a single chart the comparison is hard to read, so the charts below compare the Nasdaq with each indicator separately.</p>

<p><img src="/assets/img/posts/20200725/nasdaq_and_covid_new_cases.png" alt="Nasdaq and Covid new cases" /></p>

<p><img src="/assets/img/posts/20200725/nasdaq_and_covid_trend.png" alt="Nasdaq and Google Trends" /></p>

<p><img src="/assets/img/posts/20200725/nasdaq_and_vix.png" alt="Nasdaq and VIX" /></p>

<h2 id="결론">Conclusion</h2>

<p>The <code class="language-plaintext highlighter-rouge">Nasdaq</code> index turned out to be more strongly correlated with the <code class="language-plaintext highlighter-rouge">VIX</code> index and with <code class="language-plaintext highlighter-rouge">covid</code> Google search volume than with the actual number of daily new confirmed cases.</p>

<p>If a second COVID-19 pandemic wave does occur, my guess is that it will come around the time the <code class="language-plaintext highlighter-rouge">VIX</code> index or <code class="language-plaintext highlighter-rouge">covid</code> search volume reaches or exceeds its level from the first wave.</p>

<p>Of course, this guess is not guaranteed to be right, but the relationship with data that represents fear sentiment may at least help gauge the direction of the stock market.</p>]]></content><author><name>Jonghyun Ho</name></author><category term="Data" /><category term="Analysis" /><category term="Crawling" /><category term="Python" /><category term="Covid" /><category term="Fear" /><category term="공포 심리" /><category term="Nasdaq" /><category term="나스닥" /><summary type="html"><![CDATA[Daily new COVID-19 cases in the United States now exceed 70,000, and cumulative confirmed cases exceed 4 million.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://jonghyunho.github.io/posts/20200725/fear-and-stock.png" /><media:content medium="image" url="https://jonghyunho.github.io/posts/20200725/fear-and-stock.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Liquidity and Seoul Real Estate</title><link href="https://jonghyunho.github.io/data/analysis/liquidity_and_housing-price-index.html" rel="alternate" type="text/html" title="유동성과 서울 부동산" /><published>2020-07-11T00:00:00+09:00</published><updated>2020-07-11T00:00:00+09:00</updated><id>https://jonghyunho.github.io/data/analysis/liquidity_and_housing-price-index</id><content type="html" xml:base="https://jonghyunho.github.io/data/analysis/liquidity_and_housing-price-index.html"><![CDATA[<p>This post looks at the relationship between cash liquidity and Seoul real estate.</p>

<h2 id="m1-m2-lf">M1, M2, Lf</h2>

<ul>
  <li>
    <p>M1 (narrow money) is the sum of currency in circulation, demand deposits, and instant-access deposits (including investment trust MMFs).</p>
  </li>
  <li>
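    <p>M2 (broad money) is the sum of M1 and financial products with a maturity of less than two years (time deposits and installment savings, market-type and performance-based dividend products, financial debentures, etc.).</p>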
  </li>
  <li>
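    <p>Lf (liquidity of financial institutions, formerly M3) is the sum of M2, liquid products with a maturity of two years or more, life insurers' insurance contract reserves, etc.</p>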
  </li>
</ul>

<p>Among these, the ratio <code class="language-plaintext highlighter-rouge">M1 (narrow money) / M2 (broad money)</code> shows the share of money that can be deployed immediately, so it can serve as a proxy for cash liquidity.</p>
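<p>To make the ratio concrete, here is a small worked example. The M1 and M2 figures are copied from the first and last rows of the output table printed later in this post. The dictionary name <code class="language-plaintext highlighter-rouge">m1_m2_by_month</code> is only an illustrative choice.</p>

```python
# (M1, M2) pairs copied from the output table printed later in this post.
m1_m2_by_month = {
    "1986-01": (14883.1, 43133.6),
    "2020-04": (1012290.1, 3015816.3),
}

# M1/M2: the share of broad money that sits in immediately usable form.
ratios = {month: m1 / m2 for month, (m1, m2) in m1_m2_by_month.items()}

for month, ratio in ratios.items():
    print(month, round(ratio, 6))
```

<p>Both values land near one third, which matches the <code class="language-plaintext highlighter-rouge">M1/M2</code> column of that table.</p>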

<p>Source: <a href="https://terms.naver.com/entry.nhn?docId=20082&amp;cid=43659&amp;categoryId=43659">M1, M2, Lf</a></p>

<h2 id="한국은행-openapi">Bank of Korea OpenAPI</h2>

<p>Go to the <a href="http://ecos.bok.or.kr/jsp/openapi/OpenApiController.jsp">Bank of Korea Economic Statistics System OpenAPI</a> site, sign up, and request the authentication key needed for development from the <code class="language-plaintext highlighter-rouge">서비스 이용</code> (Service Use) &gt; <code class="language-plaintext highlighter-rouge">인증키 신청</code> (Request Authentication Key) menu.</p>

<p>Once the key is issued, the codes needed for API calls can be looked up in the <code class="language-plaintext highlighter-rouge">개발 가이드</code> (Development Guide) &gt; <code class="language-plaintext highlighter-rouge">통계코드검색</code> (Statistics Code Search) menu.</p>

<p><img src="/assets/img/posts/20200711/statistics-code.png" alt="통계코드검색" /></p>

<p>The <code class="language-plaintext highlighter-rouge">통계코드검색</code> (Statistics Code Search) page above shows that</p>

<p>the statistics table code for <code class="language-plaintext highlighter-rouge">M1</code> is <code class="language-plaintext highlighter-rouge">010Y002</code> and its item code is <code class="language-plaintext highlighter-rouge">AAAA16</code>, and</p>

<p>the statistics table code for <code class="language-plaintext highlighter-rouge">M2</code> is <code class="language-plaintext highlighter-rouge">010Y002</code> and its item code is <code class="language-plaintext highlighter-rouge">AAAA18</code>.</p>

<h2 id="한국은행-데이터-크롤링">Crawling Bank of Korea Data</h2>

<p>The <code class="language-plaintext highlighter-rouge">getDataFromBankOfKorea</code> function below takes the codes above as the parameters <code class="language-plaintext highlighter-rouge">code</code> and <code class="language-plaintext highlighter-rouge">sub_code</code>, calls the Bank of Korea API, and returns the resulting time series as a DataFrame.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">json</span>
<span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>

<span class="n">endpoint_url</span> <span class="o">=</span> <span class="s">'http://ecos.bok.or.kr/api'</span>
<span class="n">dev_key</span> <span class="o">=</span> <span class="s">'XXXXXXXXXXXXXXXXXX'</span> 
<span class="n">start_date</span> <span class="o">=</span> <span class="s">'198601'</span>
<span class="n">end_date</span> <span class="o">=</span> <span class="s">'202007'</span>

<span class="k">def</span> <span class="nf">getDataFromBankOfKorea</span><span class="p">(</span><span class="n">name</span><span class="p">,</span> <span class="n">code</span><span class="p">,</span> <span class="n">sub_code</span><span class="p">):</span>
    <span class="n">url</span> <span class="o">=</span> <span class="n">endpoint_url</span> <span class="o">+</span> <span class="s">'/StatisticSearch/'</span> <span class="o">+</span> <span class="n">dev_key</span> <span class="o">+</span> <span class="s">'/json/kr/1/2000/'</span> <span class="o">+</span> <span class="n">code</span> <span class="o">+</span> <span class="s">'/MM/'</span> <span class="o">+</span> <span class="n">start_date</span> <span class="o">+</span> <span class="s">'/'</span> <span class="o">+</span> <span class="n">end_date</span> <span class="o">+</span> <span class="s">'/'</span> <span class="o">+</span> <span class="n">sub_code</span>

    <span class="n">req</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>

    <span class="n">results</span> <span class="o">=</span> <span class="n">json</span><span class="p">.</span><span class="n">loads</span><span class="p">(</span><span class="n">req</span><span class="p">.</span><span class="n">text</span><span class="p">)</span>
    <span class="n">results</span> <span class="o">=</span> <span class="n">results</span><span class="p">[</span><span class="s">'StatisticSearch'</span><span class="p">][</span><span class="s">'row'</span><span class="p">]</span>

    <span class="n">series</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">Series</span><span class="p">(</span><span class="n">dtype</span><span class="o">=</span><span class="nb">float</span><span class="p">)</span>

    <span class="k">for</span> <span class="n">result</span> <span class="ow">in</span> <span class="n">results</span><span class="p">:</span>
        <span class="n">time</span> <span class="o">=</span> <span class="n">result</span><span class="p">[</span><span class="s">'TIME'</span><span class="p">]</span>
        <span class="n">time</span> <span class="o">=</span> <span class="n">datetime</span><span class="p">(</span><span class="n">year</span><span class="o">=</span><span class="nb">int</span><span class="p">(</span><span class="n">time</span><span class="p">[</span><span class="mi">0</span><span class="p">:</span><span class="mi">4</span><span class="p">]),</span> <span class="n">month</span><span class="o">=</span><span class="nb">int</span><span class="p">(</span><span class="n">time</span><span class="p">[</span><span class="mi">4</span><span class="p">:</span><span class="mi">6</span><span class="p">]),</span> <span class="n">day</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>

        <span class="n">value</span> <span class="o">=</span> <span class="nb">float</span><span class="p">(</span><span class="n">result</span><span class="p">[</span><span class="s">'DATA_VALUE'</span><span class="p">])</span>

        <span class="n">series</span><span class="p">[</span><span class="n">time</span><span class="p">]</span> <span class="o">=</span> <span class="n">value</span>

    <span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">()</span>
    <span class="n">df</span><span class="p">[</span><span class="n">name</span><span class="p">]</span> <span class="o">=</span> <span class="n">series</span>
    <span class="k">return</span> <span class="n">df</span>
</code></pre></div></div>
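<p>The row-parsing step of the function above can be tried offline. The sketch below uses a hypothetical payload that contains only the fields the function reads (<code class="language-plaintext highlighter-rouge">StatisticSearch</code>, <code class="language-plaintext highlighter-rouge">row</code>, <code class="language-plaintext highlighter-rouge">TIME</code>, <code class="language-plaintext highlighter-rouge">DATA_VALUE</code>). A real response presumably carries more fields, and the helper name <code class="language-plaintext highlighter-rouge">parse_rows</code> is made up for this example. It also uses <code class="language-plaintext highlighter-rouge">datetime.strptime</code> instead of slicing the string by hand.</p>

```python
from datetime import datetime

# Hypothetical payload: only the keys read by getDataFromBankOfKorea.
# The two values mirror the M1 column of the output table further down.
sample_response = {
    "StatisticSearch": {
        "row": [
            {"TIME": "202003", "DATA_VALUE": "988826.3"},
            {"TIME": "202004", "DATA_VALUE": "1012290.1"},
        ]
    }
}


def parse_rows(payload):
    """Map each monthly row to {first day of the month: float value}."""
    rows = payload["StatisticSearch"]["row"]
    return {
        # '%Y%m' parses 'YYYYMM'; the day defaults to 1.
        datetime.strptime(row["TIME"], "%Y%m"): float(row["DATA_VALUE"])
        for row in rows
    }


parsed = parse_rows(sample_response)
for month, value in parsed.items():
    print(month.date(), value)
```

<p>Building the whole mapping first and converting it to a <code class="language-plaintext highlighter-rouge">pd.Series</code> in one step would also avoid growing the Series one element at a time.</p>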

<p>Using the function above, compute the <code class="language-plaintext highlighter-rouge">M1/M2</code> liquidity ratio and combine it with the Seoul apartment sale price index data into a single DataFrame.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">getM1M2</span><span class="p">():</span>
    <span class="n">df_M1</span> <span class="o">=</span> <span class="n">getDataFromBankOfKorea</span><span class="p">(</span><span class="s">'M1'</span><span class="p">,</span> <span class="s">'010Y002'</span><span class="p">,</span> <span class="s">'AAAA16'</span><span class="p">)</span>
    <span class="n">df_M2</span> <span class="o">=</span> <span class="n">getDataFromBankOfKorea</span><span class="p">(</span><span class="s">'M2'</span><span class="p">,</span> <span class="s">'010Y002'</span><span class="p">,</span> <span class="s">'AAAA18'</span><span class="p">)</span>

    <span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">()</span>
    <span class="n">df</span><span class="p">[</span><span class="s">'M1'</span><span class="p">]</span> <span class="o">=</span> <span class="n">df_M1</span><span class="p">[</span><span class="s">'M1'</span><span class="p">]</span>
    <span class="n">df</span><span class="p">[</span><span class="s">'M2'</span><span class="p">]</span> <span class="o">=</span> <span class="n">df_M2</span><span class="p">[</span><span class="s">'M2'</span><span class="p">]</span>
    <span class="n">df</span><span class="p">[</span><span class="s">'M1/M2'</span><span class="p">]</span> <span class="o">=</span> <span class="n">df</span><span class="p">[</span><span class="s">'M1'</span><span class="p">]</span> <span class="o">/</span> <span class="n">df</span><span class="p">[</span><span class="s">'M2'</span><span class="p">]</span>
    <span class="k">return</span> <span class="n">df</span>

<span class="n">df</span> <span class="o">=</span> <span class="n">getM1M2</span><span class="p">()</span>

<span class="kn">from</span> <span class="nn">kbstar.get_house_price_index</span> <span class="kn">import</span> <span class="n">get_house_price_index</span>
<span class="n">house_price</span> <span class="o">=</span> <span class="n">get_house_price_index</span><span class="p">(</span><span class="s">'../../datasets/★(월간)KB주택가격동향_시계열(2020.06).xlsx'</span><span class="p">,</span> <span class="s">'매매종합'</span><span class="p">)</span>
<span class="n">house_price</span> <span class="o">=</span> <span class="n">house_price</span><span class="p">[</span><span class="s">'서울'</span><span class="p">][</span><span class="s">'서울'</span><span class="p">]</span>

<span class="n">df</span><span class="p">[</span><span class="s">'Seoul APT price index'</span><span class="p">]</span> <span class="o">=</span> <span class="n">house_price</span>
<span class="k">print</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">get_house_price_index</code> function is reused from the post <a href="https://jonghyunho.github.io/data/analysis/housing-purchase-price-composite-indices.html">Housing Sale Price Composite Index</a>.</p>

<p>The output is as follows.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                   M1         M2     M1/M2  Seoul APT price index
1986-01-01    14883.1    43133.6  0.345047              30.043817
1986-02-01    15038.9    43492.4  0.345782              30.043817
1986-03-01    15573.8    44587.1  0.349289              30.002377
1986-04-01    15891.8    45188.7  0.351676              29.836618
1986-05-01    16227.3    46197.5  0.351259              29.587979
...               ...        ...       ...                    ...
2019-12-01   927098.5  2912434.1  0.318324             102.559631
2020-01-01   945103.8  2929009.2  0.322670             103.055966
2020-02-01   957889.6  2954603.8  0.324202             103.416594
2020-03-01   988826.3  2984304.3  0.331342             103.905202
2020-04-01  1012290.1  3015816.3  0.335660             104.072285

<span class="o">[</span>412 rows x 4 columns]
</code></pre></div></div>

<h2 id="데이터-시각화">Data Visualization</h2>

<p>The DataFrame built above is normalized and then plotted as a chart.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">[[</span><span class="s">'Seoul APT price index'</span><span class="p">,</span> <span class="s">'M1/M2'</span><span class="p">]]</span>
<span class="n">df</span> <span class="o">=</span> <span class="p">(</span><span class="n">df</span> <span class="o">-</span> <span class="n">df</span><span class="p">.</span><span class="n">mean</span><span class="p">())</span> <span class="o">/</span> <span class="n">df</span><span class="p">.</span><span class="n">std</span><span class="p">()</span>
<span class="n">df</span><span class="p">.</span><span class="n">plot</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="n">grid</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>
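<p>The normalization line above, <code class="language-plaintext highlighter-rouge">(df - df.mean()) / df.std()</code>, is a z-score: each column is shifted to zero mean and scaled to unit standard deviation so that series on different scales can share one axis. Below is a minimal standard-library sketch of the same idea. The helper <code class="language-plaintext highlighter-rouge">zscore</code> and the toy numbers are made up for illustration.</p>

```python
import statistics


def zscore(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 denominator), as in pandas' default std().
    std = statistics.stdev(values)
    return [(v - mean) / std for v in values]


# Made-up toy series on very different scales.
price_index = [30.0, 60.0, 90.0, 104.0]
liquidity_ratio = [0.35, 0.30, 0.28, 0.34]

z_price = zscore(price_index)
z_ratio = zscore(liquidity_ratio)
print([round(z, 3) for z in z_price])
print([round(z, 3) for z in z_ratio])
```

<p>After this step both series are expressed in standard deviations from their own means, so the plot compares shapes rather than levels.</p>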

<p><img src="/assets/img/posts/20200711/liquidity-housing-comparison-graph.png" alt="liquidity-housing-comparison-graph" /></p>

<h2 id="결론">Conclusion</h2>

<p>The chart shows that during the 1998 IMF crisis and the 2008 Lehman Brothers crisis, after the <code class="language-plaintext highlighter-rouge">M1/M2</code> ratio bottomed out, the housing sale price index also failed to rise for a while and moved sideways.</p>

<p>Conversely, the <code class="language-plaintext highlighter-rouge">M1/M2</code> ratio rose sharply from the early 2000s and again from 2014, and the housing sale price index climbed steadily alongside it.</p>

<p>Liquidity dipped somewhat in 2019 but has been rising again in 2020, which suggests that the likelihood of rising home prices is increasing.</p>]]></content><author><name>Jonghyun Ho</name></author><category term="Data" /><category term="Analysis" /><category term="Crawling" /><category term="Python" /><category term="부동산" /><category term="아파트" /><category term="매매가격지수" /><summary type="html"><![CDATA[This post looks at the relationship between cash liquidity and Seoul real estate.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://jonghyunho.github.io/posts/20200711/liquidity-housing-comparison-graph.png" /><media:content medium="image" url="https://jonghyunho.github.io/posts/20200711/liquidity-housing-comparison-graph.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OpenAI gym Cartpole</title><link href="https://jonghyunho.github.io/reinforcement/learning/cartpole-reinforcement-learning.html" rel="alternate" type="text/html" title="OpenAI gym Cartpole" /><published>2020-05-05T00:00:00+09:00</published><updated>2020-05-05T00:00:00+09:00</updated><id>https://jonghyunho.github.io/reinforcement/learning/cartpole-reinforcement-learning</id><content type="html" xml:base="https://jonghyunho.github.io/reinforcement/learning/cartpole-reinforcement-learning.html"><![CDATA[<p>This post tries out <code class="language-plaintext highlighter-rouge">reinforcement learning</code> in the CartPole environment to see how an agent comes to achieve a given goal.</p>

<h2 id="강화학습">Reinforcement Learning</h2>

<p><code class="language-plaintext highlighter-rouge">Reinforcement learning</code> is an area of <code class="language-plaintext highlighter-rouge">machine learning</code>. In a given environment, a software agent observes the current state and takes an action, for which it may receive a reward from the environment. The agent learns to choose the actions that maximize the cumulative reward so that it achieves its goal.</p>

<p><img src="/assets/img/posts/20200505/agent-environment-interaction-in-mdp.png" alt="agent-environment-interaction-in-mdp" /></p>

<ul>
  <li>Reference: <a href="https://en.wikipedia.org/wiki/Reinforcement_learning">Reinforcement learning</a></li>
</ul>
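<p>The "cumulative reward" mentioned above is commonly computed with a discount factor, so that later rewards count a little less. The sketch below assumes that discounted form. The function name <code class="language-plaintext highlighter-rouge">discounted_return</code> is made up for illustration, and a discount factor of 1.0 reduces it to a plain sum.</p>

```python
def discounted_return(rewards, discount_factor):
    """Sum of rewards, each weighted by discount_factor ** step."""
    total = 0.0
    # Walk backwards so each step applies G_t = r_t + discount_factor * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + discount_factor * total
    return total


# Three steps with a reward of 1 each, as in a short CartPole episode.
episode_rewards = [1.0, 1.0, 1.0]
g = discounted_return(episode_rewards, 0.5)
print(g)  # 1 + 0.5 * 1 + 0.25 * 1 = 1.75
```

<p>With rewards of 1 per step, a longer episode always yields a larger return, which is why maximizing it pushes the agent to keep the pole up longer.</p>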

<h2 id="cartpole">Cartpole</h2>

<p>OpenAI gym's <a href="https://gym.openai.com/envs/CartPole-v1/">CartPole</a> provides an environment in which a pole is attached to a cart and naturally tips toward the ground under gravity. The goal of <code class="language-plaintext highlighter-rouge">CartPole</code> is to keep the pole upright by moving the cart left and right, and a <code class="language-plaintext highlighter-rouge">reinforcement learning</code> algorithm lets the software agent learn on its own how to balance the pole.</p>

<p>Below is an example that creates the <code class="language-plaintext highlighter-rouge">CartPole</code> environment and runs repeated episodes.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">gym</span>
<span class="n">env</span> <span class="o">=</span> <span class="n">gym</span><span class="p">.</span><span class="n">make</span><span class="p">(</span><span class="s">'CartPole-v1'</span><span class="p">)</span>

<span class="k">for</span> <span class="n">i_episode</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">20</span><span class="p">):</span>
    <span class="n">observation</span> <span class="o">=</span> <span class="n">env</span><span class="p">.</span><span class="n">reset</span><span class="p">()</span>
    <span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">100</span><span class="p">):</span>
        <span class="n">env</span><span class="p">.</span><span class="n">render</span><span class="p">()</span>
        <span class="k">print</span><span class="p">(</span><span class="n">observation</span><span class="p">)</span>
        <span class="n">action</span> <span class="o">=</span> <span class="n">env</span><span class="p">.</span><span class="n">action_space</span><span class="p">.</span><span class="n">sample</span><span class="p">()</span>
        <span class="n">observation</span><span class="p">,</span> <span class="n">reward</span><span class="p">,</span> <span class="n">done</span><span class="p">,</span> <span class="n">info</span> <span class="o">=</span> <span class="n">env</span><span class="p">.</span><span class="n">step</span><span class="p">(</span><span class="n">action</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">done</span><span class="p">:</span>
            <span class="k">print</span><span class="p">(</span><span class="s">"Episode finished after {} timesteps"</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">t</span><span class="o">+</span><span class="mi">1</span><span class="p">))</span>
            <span class="k">break</span>
<span class="n">env</span><span class="p">.</span><span class="n">close</span><span class="p">()</span>
</code></pre></div></div>

<p><img src="/assets/img/posts/20200505/cartpole_random_movement.gif" alt="cartpole_random_movement" /></p>

<p>Calling <code class="language-plaintext highlighter-rouge">env.action_space.sample()</code> returns a random action: 0 (left) or 1 (right).</p>

<p><code class="language-plaintext highlighter-rouge">env.step(action)</code> performs that random <code class="language-plaintext highlighter-rouge">action</code> once and returns the state after the <code class="language-plaintext highlighter-rouge">action</code> (<code class="language-plaintext highlighter-rouge">observation</code>), the reward (<code class="language-plaintext highlighter-rouge">reward</code>), whether the episode has ended, for example because the pole fell (<code class="language-plaintext highlighter-rouge">done</code>), and other information.</p>

<p>The following describes the <code class="language-plaintext highlighter-rouge">observation</code>, <code class="language-plaintext highlighter-rouge">action</code>, and <code class="language-plaintext highlighter-rouge">reward</code> values used in the <code class="language-plaintext highlighter-rouge">CartPole</code> environment, and how an episode <code class="language-plaintext highlighter-rouge">starts</code> and <code class="language-plaintext highlighter-rouge">ends</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Observation</span><span class="p">:</span>
	<span class="n">Type</span><span class="p">:</span> <span class="n">Box</span><span class="p">(</span><span class="mi">4</span><span class="p">)</span>
	<span class="n">Num</span>	<span class="n">Observation</span>               <span class="n">Min</span>             <span class="n">Max</span>
	<span class="mi">0</span>	<span class="n">Cart</span> <span class="n">Position</span>             <span class="o">-</span><span class="mf">4.8</span>            <span class="mf">4.8</span>
	<span class="mi">1</span>	<span class="n">Cart</span> <span class="n">Velocity</span>             <span class="o">-</span><span class="n">Inf</span>            <span class="n">Inf</span>
	<span class="mi">2</span>	<span class="n">Pole</span> <span class="n">Angle</span>                <span class="o">-</span><span class="mi">24</span> <span class="n">deg</span>         <span class="mi">24</span> <span class="n">deg</span>
	<span class="mi">3</span>	<span class="n">Pole</span> <span class="n">Velocity</span> <span class="n">At</span> <span class="n">Tip</span>      <span class="o">-</span><span class="n">Inf</span>            <span class="n">Inf</span>
	
<span class="n">Actions</span><span class="p">:</span>
	<span class="n">Type</span><span class="p">:</span> <span class="n">Discrete</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span>
	<span class="n">Num</span>	<span class="n">Action</span>
	<span class="mi">0</span>	<span class="n">Push</span> <span class="n">cart</span> <span class="n">to</span> <span class="n">the</span> <span class="n">left</span>
	<span class="mi">1</span>	<span class="n">Push</span> <span class="n">cart</span> <span class="n">to</span> <span class="n">the</span> <span class="n">right</span>

<span class="n">Reward</span><span class="p">:</span>
	<span class="n">Reward</span> <span class="ow">is</span> <span class="mi">1</span> <span class="k">for</span> <span class="n">every</span> <span class="n">step</span> <span class="n">taken</span><span class="p">,</span> <span class="n">including</span> <span class="n">the</span> <span class="n">termination</span> <span class="n">step</span>
	
<span class="n">Starting</span> <span class="n">State</span><span class="p">:</span>
	<span class="n">All</span> <span class="n">observations</span> <span class="n">are</span> <span class="n">assigned</span> <span class="n">a</span> <span class="n">uniform</span> <span class="n">random</span> <span class="n">value</span> <span class="ow">in</span> <span class="p">[</span><span class="o">-</span><span class="mf">0.05</span><span class="p">..</span><span class="mf">0.05</span><span class="p">]</span>

<span class="n">Episode</span> <span class="n">Termination</span><span class="p">:</span>
	<span class="n">Pole</span> <span class="n">Angle</span> <span class="ow">is</span> <span class="n">more</span> <span class="n">than</span> <span class="mi">12</span> <span class="n">degrees</span><span class="p">.</span>
	<span class="n">Cart</span> <span class="n">Position</span> <span class="ow">is</span> <span class="n">more</span> <span class="n">than</span> <span class="mf">2.4</span> <span class="p">(</span><span class="n">center</span> <span class="n">of</span> <span class="n">the</span> <span class="n">cart</span> <span class="n">reaches</span> <span class="n">the</span> <span class="n">edge</span> <span class="n">of</span>
	<span class="n">the</span> <span class="n">display</span><span class="p">).</span>
	<span class="n">Episode</span> <span class="n">length</span> <span class="ow">is</span> <span class="n">greater</span> <span class="n">than</span> <span class="mf">200.</span>
	<span class="n">Solved</span> <span class="n">Requirements</span><span class="p">:</span>
	<span class="n">Considered</span> <span class="n">solved</span> <span class="n">when</span> <span class="n">the</span> <span class="n">average</span> <span class="n">reward</span> <span class="ow">is</span> <span class="n">greater</span> <span class="n">than</span> <span class="ow">or</span> <span class="n">equal</span> <span class="n">to</span>
	<span class="mf">195.0</span> <span class="n">over</span> <span class="mi">100</span> <span class="n">consecutive</span> <span class="n">trials</span><span class="p">.</span>
</code></pre></div></div>
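<p>The termination conditions listed above can be written as a small predicate. This is only a sketch that mirrors the quoted description, not gym's own implementation. The function <code class="language-plaintext highlighter-rouge">episode_should_end</code> and the constant names are hypothetical, and the pole angle is assumed to be given in radians.</p>

```python
import math

ANGLE_LIMIT_RAD = math.radians(12)  # "Pole Angle is more than 12 degrees"
POSITION_LIMIT = 2.4                # "Cart Position is more than 2.4"
MAX_STEPS = 200                     # "Episode length is greater than 200"


def episode_should_end(cart_position, pole_angle_rad, steps_taken):
    """Return True if any termination condition from the description holds."""
    return (
        abs(pole_angle_rad) > ANGLE_LIMIT_RAD
        or abs(cart_position) > POSITION_LIMIT
        or steps_taken > MAX_STEPS
    )


print(episode_should_end(0.0, 0.0, 10))               # upright and centered
print(episode_should_end(0.0, math.radians(13), 10))  # pole tipped too far
print(episode_should_end(2.5, 0.0, 10))               # cart left the track
```

<p>Note that the observation range for the pole angle (24 degrees) is wider than the 12 degree limit at which the episode ends.</p>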

<h3 id="dqn-알고리즘">DQN Algorithm</h3>

<p>The pseudocode of the <code class="language-plaintext highlighter-rouge">DQN algorithm</code> is as follows.</p>

<p><img src="/assets/img/posts/20200505/algorithm-deep-q-learning-with-experience-replay.png" alt="algorithm-deep-q-learning-with-experience-replay" /></p>
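<p>The core of deep Q-learning with experience replay is the target value computed for each sampled transition: the reward alone if the episode ended, otherwise the reward plus the discounted maximum Q-value of the next state. The sketch below shows that single step in plain Python. The function name <code class="language-plaintext highlighter-rouge">td_target</code> and the example Q-values are made up, and the default discount factor of 0.99 is taken from the hyperparameters used later in this post.</p>

```python
def td_target(reward, next_q_values, done, discount_factor=0.99):
    """Target y for one transition (state, action, reward, next_state, done)."""
    if done:
        # Terminal transition: there is no future value to bootstrap from.
        return reward
    # Non-terminal: bootstrap from the best action value in the next state.
    return reward + discount_factor * max(next_q_values)


# Made-up Q-values for the two CartPole actions in the next state.
y_running = td_target(1.0, [0.5, 2.0], done=False)
y_terminal = td_target(1.0, [0.5, 2.0], done=True)
print(y_running, y_terminal)
```

<p>The network is then trained so that its Q-value for the action actually taken moves toward this target.</p>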

<h2 id="cartpole-dqn-강화-학습">Cartpole DQN Reinforcement Learning</h2>

<p>Import the modules needed for training, including <code class="language-plaintext highlighter-rouge">tensorflow</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">gym</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">random</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">from</span> <span class="nn">collections</span> <span class="kn">import</span> <span class="n">deque</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="k">as</span> <span class="n">tf</span>
<span class="kn">from</span> <span class="nn">tensorflow.keras.models</span> <span class="kn">import</span> <span class="n">Sequential</span>
<span class="kn">from</span> <span class="nn">tensorflow.keras.layers</span> <span class="kn">import</span> <span class="n">Dense</span>
<span class="kn">from</span> <span class="nn">tensorflow.keras.initializers</span> <span class="kn">import</span> <span class="n">RandomUniform</span>
<span class="kn">from</span> <span class="nn">tensorflow.keras.optimizers</span> <span class="kn">import</span> <span class="n">Adam</span>
</code></pre></div></div>

<p>Create a neural network that takes the state as input and outputs the Q-function value for each action.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">DQN</span><span class="p">(</span><span class="n">tf</span><span class="p">.</span><span class="n">keras</span><span class="p">.</span><span class="n">Model</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">action_size</span><span class="p">):</span>
        <span class="nb">super</span><span class="p">(</span><span class="n">DQN</span><span class="p">,</span> <span class="bp">self</span><span class="p">).</span><span class="n">__init__</span><span class="p">()</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">fc1</span> <span class="o">=</span> <span class="n">Dense</span><span class="p">(</span><span class="mi">24</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'relu'</span><span class="p">)</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">fc2</span> <span class="o">=</span> <span class="n">Dense</span><span class="p">(</span><span class="mi">24</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s">'relu'</span><span class="p">)</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">fc_out</span> <span class="o">=</span> <span class="n">Dense</span><span class="p">(</span><span class="n">action_size</span><span class="p">,</span> <span class="n">kernel_initializer</span><span class="o">=</span><span class="n">RandomUniform</span><span class="p">(</span><span class="o">-</span><span class="mf">1e-3</span><span class="p">,</span> <span class="mf">1e-3</span><span class="p">))</span>

    <span class="k">def</span> <span class="nf">call</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span>
        <span class="n">x</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">fc1</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
        <span class="n">x</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">fc2</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
        <span class="n">q</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">fc_out</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">q</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">DQNAgent</code> class plays the role of the agent in the <code class="language-plaintext highlighter-rouge">CartPole</code> environment. In <code class="language-plaintext highlighter-rouge">CartPole</code>, the state is a 4-dimensional vector and there are 2 possible actions.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">DQNAgent</span><span class="p">:</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">state_size</span><span class="p">,</span> <span class="n">action_size</span><span class="p">):</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">state_size</span> <span class="o">=</span> <span class="n">state_size</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">action_size</span> <span class="o">=</span> <span class="n">action_size</span>
</code></pre></div></div>

<p>Set the hyperparameters used to run the DQN algorithm.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        <span class="bp">self</span><span class="p">.</span><span class="n">discount_factor</span> <span class="o">=</span> <span class="mf">0.99</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">learning_rate</span> <span class="o">=</span> <span class="mf">0.001</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">epsilon</span> <span class="o">=</span> <span class="mf">1.0</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">epsilon_decay</span> <span class="o">=</span> <span class="mf">0.999</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">epsilon_min</span> <span class="o">=</span> <span class="mf">0.01</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">batch_size</span> <span class="o">=</span> <span class="mi">64</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">train_start</span> <span class="o">=</span> <span class="mi">1000</span>
</code></pre></div></div>
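<p>To get a feel for these values, the following minimal sketch counts how many training steps it takes for <code class="language-plaintext highlighter-rouge">epsilon</code> to decay from 1.0 to <code class="language-plaintext highlighter-rouge">epsilon_min</code>. It assumes epsilon is multiplied by <code class="language-plaintext highlighter-rouge">epsilon_decay</code> once per training step, as <code class="language-plaintext highlighter-rouge">train_model</code> does later in this post. The helper name <code class="language-plaintext highlighter-rouge">steps_until_epsilon_min</code> is hypothetical and not part of the original script.</p>

```python
import math


def steps_until_epsilon_min(epsilon=1.0, decay=0.999, epsilon_min=0.01):
    """Count how many multiplicative decay steps run before epsilon stops shrinking."""
    steps = 0
    while epsilon > epsilon_min:
        epsilon *= decay
        steps += 1
    return steps


steps = steps_until_epsilon_min()
# Closed form of the same count: smallest n with decay**n at or below epsilon_min.
closed_form = math.ceil(math.log(0.01) / math.log(0.999))
print(steps, closed_form)
```

<p>With these settings exploration fades out after roughly 4,600 training steps, which is why the log at the end of the post shows epsilon pinned near 0.01 in the later episodes.</p>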

<p>The replay memory is capped at a maximum size of 2000.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        <span class="bp">self</span><span class="p">.</span><span class="n">memory</span> <span class="o">=</span> <span class="n">deque</span><span class="p">(</span><span class="n">maxlen</span><span class="o">=</span><span class="mi">2000</span><span class="p">)</span>
</code></pre></div></div>
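<p>A <code class="language-plaintext highlighter-rouge">deque</code> with <code class="language-plaintext highlighter-rouge">maxlen</code> silently drops its oldest entry once it is full, so the memory always holds the most recent transitions. The sketch below illustrates this with a capacity of 3 instead of 2000; the string states are placeholders chosen only for illustration.</p>

```python
from collections import deque

# Small capacity for illustration; the post uses maxlen=2000.
memory = deque(maxlen=3)

for step in range(5):
    # (state, action, reward, next_state, done) with placeholder values
    memory.append((f"s{step}", 0, 0.1, f"s{step + 1}", False))

oldest_state = memory[0][0]
newest_state = memory[-1][0]
print(len(memory), oldest_state, newest_state)
```

<p>After five appends only the last three transitions remain, so the agent keeps training on recent experience without the memory growing unbounded.</p>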

<p>Two neural networks are created: <code class="language-plaintext highlighter-rouge">model</code> and <code class="language-plaintext highlighter-rouge">target_model</code>. The parameters of <code class="language-plaintext highlighter-rouge">model</code> are updated during training to learn the Q-function, and every such update also shifts the Q-values of the next state, which serve as the training target. To prevent the target from moving with each update, a separate <code class="language-plaintext highlighter-rouge">target_model</code> is used to compute the Q-values of the next state.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        <span class="bp">self</span><span class="p">.</span><span class="n">model</span> <span class="o">=</span> <span class="n">DQN</span><span class="p">(</span><span class="n">action_size</span><span class="p">)</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">target_model</span> <span class="o">=</span> <span class="n">DQN</span><span class="p">(</span><span class="n">action_size</span><span class="p">)</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">optimizer</span> <span class="o">=</span> <span class="n">Adam</span><span class="p">(</span><span class="n">learning_rate</span><span class="o">=</span><span class="bp">self</span><span class="p">.</span><span class="n">learning_rate</span><span class="p">)</span>
		
        <span class="bp">self</span><span class="p">.</span><span class="n">update_target_model</span><span class="p">()</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">update_target_model</code> copies the weights of <code class="language-plaintext highlighter-rouge">model</code> into <code class="language-plaintext highlighter-rouge">target_model</code>. The two networks are kept separate to avoid the moving-target problem, and their weights are synchronized at regular intervals.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">def</span> <span class="nf">update_target_model</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">target_model</span><span class="p">.</span><span class="n">set_weights</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">model</span><span class="p">.</span><span class="n">get_weights</span><span class="p">())</span>
</code></pre></div></div>

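<p>Store the current state S, action A, reward R, next state S’, and the <code class="language-plaintext highlighter-rouge">done</code> flag in the replay memory.</p>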

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">def</span> <span class="nf">remember</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">state</span><span class="p">,</span> <span class="n">action</span><span class="p">,</span> <span class="n">reward</span><span class="p">,</span> <span class="n">next_state</span><span class="p">,</span> <span class="n">done</span><span class="p">):</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">append</span><span class="p">((</span><span class="n">state</span><span class="p">,</span> <span class="n">action</span><span class="p">,</span> <span class="n">reward</span><span class="p">,</span> <span class="n">next_state</span><span class="p">,</span> <span class="n">done</span><span class="p">))</span>
</code></pre></div></div>

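<p><code class="language-plaintext highlighter-rouge">epsilon</code> controls the balance between <code class="language-plaintext highlighter-rouge">exploration</code> and <code class="language-plaintext highlighter-rouge">exploitation</code>.</p>

<p>If actions are chosen using only what has been learned so far, the agent never gets to experience new parts of the environment. An ε-greedy policy is therefore used: draw a random number, and if it is at most <code class="language-plaintext highlighter-rouge">epsilon</code>, take a random action; otherwise take the action that the learned model rates highest.</p>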

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">def</span> <span class="nf">choose_action</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">state</span><span class="p">):</span>
        <span class="k">return</span> <span class="n">random</span><span class="p">.</span><span class="n">randrange</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">action_size</span><span class="p">)</span> <span class="k">if</span> <span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">rand</span><span class="p">()</span> <span class="o">&lt;=</span> <span class="bp">self</span><span class="p">.</span><span class="n">epsilon</span><span class="p">)</span> <span class="k">else</span> <span class="n">np</span><span class="p">.</span><span class="n">argmax</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">model</span><span class="p">.</span><span class="n">predict</span><span class="p">(</span><span class="n">state</span><span class="p">))</span>
</code></pre></div></div>
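<p>The same decision rule can be sketched without TensorFlow, with a plain list of Q-values standing in for the model output. The function signature and the <code class="language-plaintext highlighter-rouge">rng</code> parameter are assumptions made for this illustration; they are not part of the original agent.</p>

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy selection over a list of Q-values (one entry per action)."""
    if epsilon >= rng.random():
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the index of the largest Q-value.
    return max(range(len(q_values)), key=q_values.__getitem__)


rng = random.Random(0)  # seeded so the run is repeatable
q_values = [0.2, 0.7]   # placeholder Q-values for the two CartPole actions

greedy_action = choose_action(q_values, epsilon=0.0, rng=rng)
explored = [choose_action(q_values, epsilon=1.0, rng=rng) for _ in range(1000)]
print(greedy_action, sorted(set(explored)))
```

<p>With <code class="language-plaintext highlighter-rouge">epsilon=0.0</code> the higher-valued action is chosen, while with <code class="language-plaintext highlighter-rouge">epsilon=1.0</code> both actions appear regardless of their Q-values.</p>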

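<p>To reduce the correlation between samples, transitions stored in the replay memory are drawn at random to build the mini-batch used for training.</p>

<p>Training proceeds by <code class="language-plaintext highlighter-rouge">gradient descent</code>, shrinking the difference between <code class="language-plaintext highlighter-rouge">targets</code>, the target values computed from the Bellman optimality equation, and <code class="language-plaintext highlighter-rouge">predicts</code>, the values predicted by the model.</p>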

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">def</span> <span class="nf">train_model</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">if</span> <span class="bp">self</span><span class="p">.</span><span class="n">epsilon</span> <span class="o">&gt;</span> <span class="bp">self</span><span class="p">.</span><span class="n">epsilon_min</span><span class="p">:</span>
            <span class="bp">self</span><span class="p">.</span><span class="n">epsilon</span> <span class="o">*=</span> <span class="bp">self</span><span class="p">.</span><span class="n">epsilon_decay</span>

        <span class="n">mini_batch</span> <span class="o">=</span> <span class="n">random</span><span class="p">.</span><span class="n">sample</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">memory</span><span class="p">,</span> <span class="bp">self</span><span class="p">.</span><span class="n">batch_size</span><span class="p">)</span>

        <span class="n">states</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">sample</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">sample</span> <span class="ow">in</span> <span class="n">mini_batch</span><span class="p">])</span>
        <span class="n">actions</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">sample</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="k">for</span> <span class="n">sample</span> <span class="ow">in</span> <span class="n">mini_batch</span><span class="p">])</span>
        <span class="n">rewards</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">sample</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">for</span> <span class="n">sample</span> <span class="ow">in</span> <span class="n">mini_batch</span><span class="p">])</span>
        <span class="n">next_states</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">sample</span><span class="p">[</span><span class="mi">3</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">sample</span> <span class="ow">in</span> <span class="n">mini_batch</span><span class="p">])</span>
        <span class="n">dones</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">sample</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="k">for</span> <span class="n">sample</span> <span class="ow">in</span> <span class="n">mini_batch</span><span class="p">])</span>

        <span class="n">model_params</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">model</span><span class="p">.</span><span class="n">trainable_variables</span>
        <span class="k">with</span> <span class="n">tf</span><span class="p">.</span><span class="n">GradientTape</span><span class="p">()</span> <span class="k">as</span> <span class="n">tape</span><span class="p">:</span>
            <span class="n">predicts</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">model</span><span class="p">(</span><span class="n">states</span><span class="p">)</span>
            <span class="n">one_hot_action</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">one_hot</span><span class="p">(</span><span class="n">actions</span><span class="p">,</span> <span class="bp">self</span><span class="p">.</span><span class="n">action_size</span><span class="p">)</span>
            <span class="n">predicts</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">reduce_sum</span><span class="p">(</span><span class="n">one_hot_action</span> <span class="o">*</span> <span class="n">predicts</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>

            <span class="n">target_predicts</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">target_model</span><span class="p">(</span><span class="n">next_states</span><span class="p">)</span>
            <span class="n">target_predicts</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">stop_gradient</span><span class="p">(</span><span class="n">target_predicts</span><span class="p">)</span>

            <span class="n">max_q</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">amax</span><span class="p">(</span><span class="n">target_predicts</span><span class="p">,</span> <span class="n">axis</span><span class="o">=-</span><span class="mi">1</span><span class="p">)</span>
            <span class="n">targets</span> <span class="o">=</span> <span class="n">rewards</span> <span class="o">+</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">dones</span><span class="p">)</span> <span class="o">*</span> <span class="bp">self</span><span class="p">.</span><span class="n">discount_factor</span> <span class="o">*</span> <span class="n">max_q</span>
            <span class="n">loss</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">reduce_mean</span><span class="p">(</span><span class="n">tf</span><span class="p">.</span><span class="n">square</span><span class="p">(</span><span class="n">targets</span> <span class="o">-</span> <span class="n">predicts</span><span class="p">))</span>

        <span class="n">grads</span> <span class="o">=</span> <span class="n">tape</span><span class="p">.</span><span class="n">gradient</span><span class="p">(</span><span class="n">loss</span><span class="p">,</span> <span class="n">model_params</span><span class="p">)</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">optimizer</span><span class="p">.</span><span class="n">apply_gradients</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">grads</span><span class="p">,</span> <span class="n">model_params</span><span class="p">))</span>
</code></pre></div></div>
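<p>The target computed inside <code class="language-plaintext highlighter-rouge">train_model</code> is the reward plus the discounted maximum Q-value of the next state, with the second term dropped when the episode has ended. The sketch below reproduces that arithmetic on plain Python lists; the helper name <code class="language-plaintext highlighter-rouge">td_targets</code> and the sample numbers are made up for illustration.</p>

```python
def td_targets(rewards, dones, next_q_values, discount_factor=0.99):
    """Compute r + (1 - done) * gamma * max_a Q_target(s', a) for each transition."""
    targets = []
    for reward, done, next_q in zip(rewards, dones, next_q_values):
        # Terminal transitions have no future value to bootstrap from.
        bootstrap = 0.0 if done else discount_factor * max(next_q)
        targets.append(reward + bootstrap)
    return targets


# Two placeholder transitions: one mid-episode, one terminal.
rewards = [0.1, -1.0]
dones = [False, True]
next_q_values = [[1.0, 2.0], [5.0, 3.0]]  # stand-in for target_model output

targets = td_targets(rewards, dones, next_q_values)
print(targets)
```

<p>The first target is 0.1 + 0.99 × 2.0, while the second is just the reward of -1.0, because the terminal flag removes the bootstrapped term even though the next-state Q-values are large.</p>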

<p>Create the <code class="language-plaintext highlighter-rouge">CartPole-v1</code> environment and the <code class="language-plaintext highlighter-rouge">DQNAgent</code> that will learn in it.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">"__main__"</span><span class="p">:</span>
    <span class="n">env</span> <span class="o">=</span> <span class="n">gym</span><span class="p">.</span><span class="n">make</span><span class="p">(</span><span class="s">'CartPole-v1'</span><span class="p">)</span>
    <span class="n">state_size</span> <span class="o">=</span> <span class="n">env</span><span class="p">.</span><span class="n">observation_space</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
    <span class="n">action_size</span> <span class="o">=</span> <span class="n">env</span><span class="p">.</span><span class="n">action_space</span><span class="p">.</span><span class="n">n</span>

    <span class="n">agent</span> <span class="o">=</span> <span class="n">DQNAgent</span><span class="p">(</span><span class="n">state_size</span><span class="p">,</span> <span class="n">action_size</span><span class="p">)</span>

    <span class="n">scores</span><span class="p">,</span> <span class="n">episodes</span> <span class="o">=</span> <span class="p">[],</span> <span class="p">[]</span>
    <span class="n">score_avg</span> <span class="o">=</span> <span class="mi">0</span>
</code></pre></div></div>

<p>The environment is reset at the start of every episode.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="n">num_episode</span> <span class="o">=</span> <span class="mi">300</span>
    <span class="k">for</span> <span class="n">e</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">num_episode</span><span class="p">):</span>
        <span class="n">done</span> <span class="o">=</span> <span class="bp">False</span>
        <span class="n">score</span> <span class="o">=</span> <span class="mi">0</span>

        <span class="n">state</span> <span class="o">=</span> <span class="n">env</span><span class="p">.</span><span class="n">reset</span><span class="p">()</span>
        <span class="n">state</span> <span class="o">=</span> <span class="n">state</span><span class="p">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
</code></pre></div></div>

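<p>Select an action in the current state and advance the environment by one step.</p>

<p>The resulting reward is stored in the replay memory together with the current state, the chosen action, the next state, and the done flag.</p>

<p>Once the replay memory holds at least <code class="language-plaintext highlighter-rouge">train_start</code> transitions, the model is trained on every step.</p>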

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        <span class="k">while</span> <span class="ow">not</span> <span class="n">done</span><span class="p">:</span>
            <span class="n">env</span><span class="p">.</span><span class="n">render</span><span class="p">()</span>

            <span class="n">action</span> <span class="o">=</span> <span class="n">agent</span><span class="p">.</span><span class="n">choose_action</span><span class="p">(</span><span class="n">state</span><span class="p">)</span>

            <span class="n">next_state</span><span class="p">,</span> <span class="n">reward</span><span class="p">,</span> <span class="n">done</span><span class="p">,</span> <span class="n">info</span> <span class="o">=</span> <span class="n">env</span><span class="p">.</span><span class="n">step</span><span class="p">(</span><span class="n">action</span><span class="p">)</span>
            <span class="n">next_state</span> <span class="o">=</span> <span class="n">next_state</span><span class="p">.</span><span class="n">reshape</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>

            <span class="n">score</span> <span class="o">+=</span> <span class="n">reward</span>
            <span class="n">reward</span> <span class="o">=</span> <span class="mf">0.1</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">done</span> <span class="ow">or</span> <span class="n">score</span> <span class="o">==</span> <span class="mi">500</span> <span class="k">else</span> <span class="o">-</span><span class="mi">1</span>

            <span class="n">agent</span><span class="p">.</span><span class="n">remember</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">action</span><span class="p">,</span> <span class="n">reward</span><span class="p">,</span> <span class="n">next_state</span><span class="p">,</span> <span class="n">done</span><span class="p">)</span>

            <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">agent</span><span class="p">.</span><span class="n">memory</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="n">agent</span><span class="p">.</span><span class="n">train_start</span><span class="p">:</span>
                <span class="n">agent</span><span class="p">.</span><span class="n">train_model</span><span class="p">()</span>

            <span class="n">state</span> <span class="o">=</span> <span class="n">next_state</span>
</code></pre></div></div>
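<p>Note that the loop above does not store the raw environment reward. It replaces it with 0.1 for every step that keeps the episode going, or that ends the episode by reaching the score of 500, and with -1 when the episode ends early. A minimal sketch of that rule as a standalone function follows; the name <code class="language-plaintext highlighter-rouge">shaped_reward</code> and the <code class="language-plaintext highlighter-rouge">max_score</code> parameter are hypothetical.</p>

```python
def shaped_reward(done, score, max_score=500):
    """Reward stored in replay memory: small bonus for surviving, penalty for failing."""
    if not done or score == max_score:
        return 0.1
    return -1.0


still_running = shaped_reward(done=False, score=10)
failed_early = shaped_reward(done=True, score=37)
reached_limit = shaped_reward(done=True, score=500)
print(still_running, failed_early, reached_limit)
```

<p>Treating the score of 500 as a success rather than a failure keeps the agent from being penalized for an episode that ended only because it reached the maximum score.</p>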

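<p>Each time an episode ends, <code class="language-plaintext highlighter-rouge">target_model</code> is synchronized with the weights of <code class="language-plaintext highlighter-rouge">model</code>, and the running average score is printed and plotted. Once the average score exceeds 400, the model weights are saved and the program exits.</p>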

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>            <span class="k">if</span> <span class="n">done</span><span class="p">:</span>
                <span class="n">agent</span><span class="p">.</span><span class="n">update_target_model</span><span class="p">()</span>

                <span class="n">score_avg</span> <span class="o">=</span> <span class="mf">0.9</span> <span class="o">*</span> <span class="n">score_avg</span> <span class="o">+</span> <span class="mf">0.1</span> <span class="o">*</span> <span class="n">score</span> <span class="k">if</span> <span class="n">score_avg</span> <span class="o">!=</span> <span class="mi">0</span> <span class="k">else</span> <span class="n">score</span>
                <span class="k">print</span><span class="p">(</span><span class="s">'episode: {:3d} | score avg {:3.2f} | memory length: {:4d} | epsilon: {:.4f}'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">e</span><span class="p">,</span> <span class="n">score_avg</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">agent</span><span class="p">.</span><span class="n">memory</span><span class="p">),</span> <span class="n">agent</span><span class="p">.</span><span class="n">epsilon</span><span class="p">))</span>

                <span class="n">scores</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">score_avg</span><span class="p">)</span>
                <span class="n">episodes</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">e</span><span class="p">)</span>
                <span class="n">plt</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">episodes</span><span class="p">,</span> <span class="n">scores</span><span class="p">,</span> <span class="s">'b'</span><span class="p">)</span>
                <span class="n">plt</span><span class="p">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s">'episode'</span><span class="p">)</span>
                <span class="n">plt</span><span class="p">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s">'average score'</span><span class="p">)</span>
                <span class="n">plt</span><span class="p">.</span><span class="n">savefig</span><span class="p">(</span><span class="s">'cartpole_graph.png'</span><span class="p">)</span>

                <span class="k">if</span> <span class="n">score_avg</span> <span class="o">&gt;</span> <span class="mi">400</span><span class="p">:</span>
                    <span class="n">agent</span><span class="p">.</span><span class="n">model</span><span class="p">.</span><span class="n">save_weights</span><span class="p">(</span><span class="s">'./save_model/model'</span><span class="p">,</span> <span class="n">save_format</span><span class="o">=</span><span class="s">'tf'</span><span class="p">)</span>
                    <span class="n">sys</span><span class="p">.</span><span class="nb">exit</span><span class="p">()</span>

</code></pre></div></div>
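<p>The <code class="language-plaintext highlighter-rouge">score_avg</code> update is an exponential moving average that gives the newest episode a weight of 0.1, seeded with the first episode's score. The sketch below replays it on three episode scores. The scores 59, 15, and 23 are inferred from the memory-length increments in the log that follows (59, 74, 97), so treat them as an assumption rather than recorded data. The helper name <code class="language-plaintext highlighter-rouge">update_score_avg</code> is hypothetical.</p>

```python
def update_score_avg(score_avg, score):
    """Exponential moving average; the first episode initializes the average."""
    return 0.9 * score_avg + 0.1 * score if score_avg != 0 else score


score_avg = 0
history = []
for score in [59, 15, 23]:  # episode scores inferred from the log
    score_avg = update_score_avg(score_avg, score)
    history.append(round(score_avg, 2))

print(history)
```

<p>The resulting averages match the first three lines of the log (59.00, 54.60, 51.44), which also shows why the average reacts slowly to individual short or long episodes.</p>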

<p>Running the code above prints the current status at the end of every episode. The average score rises as the episodes progress, which shows that the agent is learning.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">episode</span><span class="p">:</span>   <span class="mi">0</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">59.00</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span>   <span class="mi">59</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">1.0000</span>
<span class="n">episode</span><span class="p">:</span>   <span class="mi">1</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">54.60</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span>   <span class="mi">74</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">1.0000</span>
<span class="n">episode</span><span class="p">:</span>   <span class="mi">2</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">51.44</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span>   <span class="mi">97</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">1.0000</span>
<span class="n">episode</span><span class="p">:</span>   <span class="mi">3</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">47.20</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span>  <span class="mi">106</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">1.0000</span>
<span class="n">episode</span><span class="p">:</span>   <span class="mi">4</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">45.98</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span>  <span class="mi">141</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">1.0000</span>
<span class="n">episode</span><span class="p">:</span>   <span class="mi">5</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">42.88</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span>  <span class="mi">156</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">1.0000</span>
<span class="n">episode</span><span class="p">:</span>   <span class="mi">6</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">40.59</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span>  <span class="mi">176</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">1.0000</span>
<span class="n">episode</span><span class="p">:</span>   <span class="mi">7</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">38.83</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span>  <span class="mi">199</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">1.0000</span>
<span class="n">episode</span><span class="p">:</span>   <span class="mi">8</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">36.05</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span>  <span class="mi">210</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">1.0000</span>
<span class="p">...</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">177</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">284.15</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">178</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">292.34</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">179</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">294.70</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">180</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">299.43</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">181</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">319.49</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">182</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">326.04</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">183</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">337.44</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">184</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">332.79</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">185</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">349.51</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">186</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">364.56</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">187</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">378.11</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">188</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">390.30</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="n">episode</span><span class="p">:</span> <span class="mi">189</span> <span class="o">|</span> <span class="n">score</span> <span class="n">avg</span> <span class="mf">401.27</span> <span class="o">|</span> <span class="n">memory</span> <span class="n">length</span><span class="p">:</span> <span class="mi">2000</span> <span class="o">|</span> <span class="n">epsilon</span><span class="p">:</span> <span class="mf">0.0100</span>
<span class="p">...</span>
</code></pre></div></div>
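<p>The constant values in the log (<code class="language-plaintext highlighter-rouge">memory length: 2000</code>, <code class="language-plaintext highlighter-rouge">epsilon: 0.0100</code>) are what a bounded replay memory and a floor on epsilon would produce. The sketch below only illustrates that behavior: the capacity and the floor are read off the log, while the decay rate of 0.999 and the helper name <code class="language-plaintext highlighter-rouge">decay_epsilon</code> are assumptions for illustration.</p>

```python
from collections import deque

MEMORY_CAPACITY = 2000   # matches "memory length: 2000" in the log
EPSILON_MIN = 0.01       # matches "epsilon: 0.0100" in the log
EPSILON_DECAY = 0.999    # assumed decay rate, for illustration only


def decay_epsilon(epsilon: float) -> float:
    """Shrink epsilon multiplicatively, but never below the floor."""
    return max(EPSILON_MIN, epsilon * EPSILON_DECAY)


# A deque with maxlen silently drops its oldest item once it is full,
# so the reported length stops growing at the capacity.
memory = deque(maxlen=MEMORY_CAPACITY)
epsilon = 1.0

for step in range(10_000):
    memory.append(step)               # stand-in for a stored transition
    epsilon = decay_epsilon(epsilon)

print(len(memory), f"{epsilon:.4f}")
```

<p>Once both values have saturated, further improvement in the average score comes from training on the stored transitions rather than from additional exploration.</p>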

<p><code class="language-plaintext highlighter-rouge">cartpole_graph.png</code> plots how the saved <code class="language-plaintext highlighter-rouge">score</code> changes as training progresses.</p>

<p><img src="/assets/img/posts/20200505/cartpole_train_graph.png" alt="cartpole_train_graph" /></p>

<p>Early in training, the agent keeps the pole upright only unsteadily, as the following run shows.</p>

<p><img src="/assets/img/posts/20200505/cartpole_random_movement.gif" alt="cartpole_random_movement" /></p>

<p>After roughly 100 episodes, the agent balances the pole stably.</p>

<p><img src="/assets/img/posts/20200505/cartpole_episode_100.gif" alt="cartpole_episode_100" /></p>

<h2 id="reference">Reference</h2>

<p><a href="https://github.com/openai/gym/blob/master/gym/envs/classic_control/cartpole.py">OpenAI gym - CartPole github</a></p>

<p><a href="https://github.com/rlcode/reinforcement-learning-kr-v2/blob/master/2-cartpole/1-dqn/train.py">Cartpole DQN github</a></p>

<p><a href="http://hunkim.github.io/ml/">모두를 위한 머신러닝과 딥러닝의 강의 - Deep Reinforcement Learning</a></p>

<p><a href="https://arxiv.org/pdf/1312.5602.pdf">Playing Atari with Deep Reinforcement Learning</a></p>]]></content><author><name>Jonghyun Ho</name></author><category term="Reinforcement" /><category term="Learning" /><category term="OpenAI" /><category term="gym" /><category term="Cartpole" /><category term="Python" /><category term="Reinforcement Learning" /><category term="강화학습" /><summary type="html"><![CDATA[This post tries out reinforcement learning in the CartPole environment and follows how the agent learns to achieve the given goal.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://jonghyunho.github.io/posts/20200505/cartpole_episode_100.gif" /><media:content medium="image" url="https://jonghyunho.github.io/posts/20200505/cartpole_episode_100.gif" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">The Relationship Between Seoul Real Estate and the KOSPI Index</title><link href="https://jonghyunho.github.io/data/analysis/comparison-real-estate-and-stock-indices.html" rel="alternate" type="text/html" title="The Relationship Between Seoul Real Estate and the KOSPI Index" /><published>2020-05-02T00:00:00+09:00</published><updated>2020-05-02T00:00:00+09:00</updated><id>https://jonghyunho.github.io/data/analysis/comparison-real-estate-and-stock-indices</id><content type="html" xml:base="https://jonghyunho.github.io/data/analysis/comparison-real-estate-and-stock-indices.html"><![CDATA[<p>It is often said that when real estate prices rise, stock prices fall, and when real estate prices fall, stock prices rise.</p>

<p>This post examines how the two indices, real estate and stocks, relate to each other. To keep the comparison concrete, <code class="language-plaintext highlighter-rouge">Seoul real estate</code> and the <code class="language-plaintext highlighter-rouge">KOSPI index</code> are used as the subjects.</p>

<h2 id="python-을-이용한-분석">Analysis with Python</h2>

<p>Import the required modules.</p>

<p>The <code class="language-plaintext highlighter-rouge">get_house_price_index</code> function is reused as implemented in the <a href="https://jonghyunho.github.io/data/analysis/housing-purchase-price-composite-indices.html">Housing Purchase Price Composite Index</a> post.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">datetime</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>

<span class="kn">import</span> <span class="nn">yfinance</span> <span class="k">as</span> <span class="n">yf</span>
<span class="kn">from</span> <span class="nn">kbstar.get_house_price_index</span> <span class="kn">import</span> <span class="n">get_house_price_index</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">get_stock_prices</code> uses the same approach as <a href="https://jonghyunho.github.io/data/crawling/how-to-get-stock-data-using-yahoo-finance-python-api.html">How to get stock data using Yahoo Finance Python API</a>, except that the daily data is converted to monthly data by averaging the daily closing prices within each month.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">get_stock_prices</span><span class="p">(</span><span class="n">code</span><span class="p">,</span> <span class="n">column_name</span><span class="p">,</span> <span class="n">period</span><span class="o">=</span><span class="s">'5y'</span><span class="p">,</span> <span class="n">is_monthly</span><span class="o">=</span><span class="bp">True</span><span class="p">):</span>
    <span class="n">stock</span> <span class="o">=</span> <span class="n">yf</span><span class="p">.</span><span class="n">Ticker</span><span class="p">(</span><span class="n">code</span><span class="p">)</span>
    <span class="n">stock</span> <span class="o">=</span> <span class="n">stock</span><span class="p">.</span><span class="n">history</span><span class="p">(</span><span class="n">period</span><span class="o">=</span><span class="n">period</span><span class="p">)</span>
    <span class="c1"># rename() returns a new DataFrame, so later column assignment does not act on a slice</span>
    <span class="n">stock</span> <span class="o">=</span> <span class="n">stock</span><span class="p">[[</span><span class="s">'Close'</span><span class="p">]].</span><span class="n">rename</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="p">{</span><span class="s">'Close'</span><span class="p">:</span> <span class="n">column_name</span><span class="p">})</span>

    <span class="k">if</span> <span class="n">is_monthly</span><span class="p">:</span>
        <span class="n">monthly</span> <span class="o">=</span> <span class="p">[]</span>
        <span class="k">for</span> <span class="n">index</span> <span class="ow">in</span> <span class="n">stock</span><span class="p">.</span><span class="n">index</span><span class="p">:</span>
            <span class="n">monthly</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">datetime</span><span class="p">.</span><span class="n">datetime</span><span class="p">(</span><span class="n">index</span><span class="p">.</span><span class="n">year</span><span class="p">,</span> <span class="n">index</span><span class="p">.</span><span class="n">month</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>

        <span class="n">stock</span><span class="p">[</span><span class="s">'Monthly'</span><span class="p">]</span> <span class="o">=</span> <span class="n">monthly</span>
        <span class="n">stock</span> <span class="o">=</span> <span class="n">stock</span><span class="p">.</span><span class="n">groupby</span><span class="p">(</span><span class="s">'Monthly'</span><span class="p">).</span><span class="n">mean</span><span class="p">()</span>

    <span class="k">return</span> <span class="n">stock</span>
</code></pre></div></div>
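<p>The monthly conversion above can also be written without an explicit loop. The following is a minimal sketch on synthetic business-day data rather than real quotes; the helper name <code class="language-plaintext highlighter-rouge">to_monthly_mean</code> is hypothetical, and the frame is assumed to have a <code class="language-plaintext highlighter-rouge">DatetimeIndex</code>.</p>

```python
import numpy as np
import pandas as pd


def to_monthly_mean(daily: pd.DataFrame) -> pd.DataFrame:
    """Average daily rows per calendar month, labelled by the first day of the month."""
    monthly = daily.groupby(daily.index.to_period('M')).mean()
    monthly.index = monthly.index.to_timestamp()   # period start, e.g. 2020-01-01
    monthly.index.name = 'Monthly'
    return monthly


# Synthetic daily series (not real KOSPI data): 0, 1, 2, ... on business days.
dates = pd.bdate_range('2020-01-01', '2020-03-31')
daily = pd.DataFrame({'kospi': np.arange(len(dates), dtype=float)}, index=dates)

monthly = to_monthly_mean(daily)
print(monthly)
```

<p>Like the loop in <code class="language-plaintext highlighter-rouge">get_stock_prices</code>, grouping by period only produces rows for months that actually contain data.</p>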

<p>The data aggregated by month looks as follows.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">kospi</span> <span class="o">=</span> <span class="n">get_stock_prices</span><span class="p">(</span><span class="s">'^KS11'</span><span class="p">,</span> <span class="s">'kospi'</span><span class="p">,</span> <span class="s">'max'</span><span class="p">)</span>  <span class="c1"># KOSPI Composite Index
</span><span class="k">print</span><span class="p">(</span><span class="n">kospi</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                  <span class="n">kospi</span>
<span class="n">Monthly</span>                
<span class="mi">1997</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">01</span>   <span class="mf">751.725455</span>
<span class="mi">1997</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">01</span>   <span class="mf">741.660500</span>
<span class="mi">1997</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">01</span>   <span class="mf">676.228947</span>
<span class="mi">1997</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">01</span>   <span class="mf">581.165909</span>
<span class="mi">1997</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">01</span>   <span class="mf">497.304000</span>
<span class="p">...</span>                 <span class="p">...</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">2147.013500</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">2203.442500</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">2167.123500</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">1786.746364</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">1849.589000</span>

<span class="p">[</span><span class="mi">274</span> <span class="n">rows</span> <span class="n">x</span> <span class="mi">1</span> <span class="n">columns</span><span class="p">]</span>
</code></pre></div></div>

<p>Read the <code class="language-plaintext highlighter-rouge">매매종합</code> (composite purchase price) sheet from the Excel file of time-series data stored in the datasets folder. For comparison with the <code class="language-plaintext highlighter-rouge">KOSPI index</code>, the <code class="language-plaintext highlighter-rouge">real estate index</code> uses the data for <code class="language-plaintext highlighter-rouge">서울</code> (Seoul).</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">house_price</span> <span class="o">=</span> <span class="n">get_house_price_index</span><span class="p">(</span><span class="s">'datasets/★(월간)KB주택가격동향_시계열(2020.04).xlsx'</span><span class="p">,</span> <span class="s">'매매종합'</span><span class="p">)</span>
<span class="n">house_price</span> <span class="o">=</span> <span class="n">house_price</span><span class="p">[</span><span class="s">'서울'</span><span class="p">][</span><span class="s">'서울'</span><span class="p">]</span>
<span class="k">print</span><span class="p">(</span><span class="n">house_price</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="mi">1986</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">01</span>     <span class="mf">30.043817</span>
<span class="mi">1986</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">01</span>     <span class="mf">30.043817</span>
<span class="mi">1986</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">01</span>     <span class="mf">30.002377</span>
<span class="mi">1986</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">01</span>     <span class="mf">29.836618</span>
<span class="mi">1986</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">01</span>     <span class="mf">29.587979</span>
                 <span class="p">...</span>    
<span class="mi">2019</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">01</span>    <span class="mf">102.559631</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">01</span>    <span class="mf">103.055966</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">01</span>    <span class="mf">103.416594</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">01</span>    <span class="mf">103.905202</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">01</span>    <span class="mf">104.072285</span>
<span class="n">Name</span><span class="p">:</span> <span class="n">서울</span><span class="p">,</span> <span class="n">Length</span><span class="p">:</span> <span class="mi">412</span><span class="p">,</span> <span class="n">dtype</span><span class="p">:</span> <span class="n">float64</span>
</code></pre></div></div>
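<p>The double lookup <code class="language-plaintext highlighter-rouge">house_price['서울']['서울']</code> suggests that the columns have two levels. The sketch below assumes such a layout (region group, then district) on a small made-up frame; the actual sheet structure and the district names used here are assumptions.</p>

```python
import pandas as pd

# Hypothetical two-level columns: (region group, district). The real sheet may differ.
columns = pd.MultiIndex.from_tuples([('서울', '서울'), ('서울', '강북'), ('부산', '부산')])
frame = pd.DataFrame(
    [[30.0, 28.0, 25.0], [30.5, 28.4, 25.1]],
    index=pd.to_datetime(['1986-01-01', '1986-02-01']),
    columns=columns,
)

chained = frame['서울']['서울']     # two lookups, as in the post
direct = frame[('서울', '서울')]     # one lookup with the full column key

print(direct)
```

<p>Both forms return the same values; the tuple form makes the intended column explicit in a single step.</p>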

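<p>Combine the two indices into a single <code class="language-plaintext highlighter-rouge">DataFrame</code>, then standardize each column (subtract its mean and divide by its standard deviation) so that the two can be compared on the same scale.</p>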

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">()</span>
<span class="n">df</span><span class="p">[</span><span class="s">'housing'</span><span class="p">]</span> <span class="o">=</span> <span class="n">house_price</span>
<span class="n">df</span><span class="p">[</span><span class="s">'kospi'</span><span class="p">]</span> <span class="o">=</span> <span class="n">kospi</span><span class="p">.</span><span class="n">kospi</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">dropna</span><span class="p">()</span>
<span class="n">df</span> <span class="o">=</span> <span class="p">(</span><span class="n">df</span> <span class="o">-</span> <span class="n">df</span><span class="p">.</span><span class="n">mean</span><span class="p">())</span> <span class="o">/</span> <span class="n">df</span><span class="p">.</span><span class="n">std</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="n">df</span><span class="p">)</span>
</code></pre></div></div>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>             <span class="n">housing</span>     <span class="n">kospi</span>
<span class="mi">1997</span><span class="o">-</span><span class="mi">07</span><span class="o">-</span><span class="mi">01</span> <span class="o">-</span><span class="mf">1.457585</span> <span class="o">-</span><span class="mf">1.118629</span>
<span class="mi">1997</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">01</span> <span class="o">-</span><span class="mf">1.453304</span> <span class="o">-</span><span class="mf">1.134642</span>
<span class="mi">1997</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">01</span> <span class="o">-</span><span class="mf">1.442603</span> <span class="o">-</span><span class="mf">1.238745</span>
<span class="mi">1997</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mi">01</span> <span class="o">-</span><span class="mf">1.440463</span> <span class="o">-</span><span class="mf">1.389991</span>
<span class="mi">1997</span><span class="o">-</span><span class="mi">11</span><span class="o">-</span><span class="mi">01</span> <span class="o">-</span><span class="mf">1.446884</span> <span class="o">-</span><span class="mf">1.523417</span>
<span class="p">...</span>              <span class="p">...</span>       <span class="p">...</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">1.621952</span>  <span class="mf">1.101293</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">1.647585</span>  <span class="mf">1.191072</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">1.666210</span>  <span class="mf">1.133288</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">1.691445</span>  <span class="mf">0.528103</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">1.700074</span>  <span class="mf">0.628086</span>

<span class="p">[</span><span class="mi">274</span> <span class="n">rows</span> <span class="n">x</span> <span class="mi">2</span> <span class="n">columns</span><span class="p">]</span>
</code></pre></div></div>
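<p>Standardizing puts both columns on a common scale for plotting, but it does not change their correlation. A small check on synthetic data (random walks standing in for the two indices, not real values) illustrates this:</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic stand-ins for the two indices, on very different scales.
raw = pd.DataFrame({
    'housing': 100 + rng.normal(size=50).cumsum(),
    'kospi': 2000 + 20 * rng.normal(size=50).cumsum(),
})

z = (raw - raw.mean()) / raw.std()   # same formula as in the post

corr_raw = raw['housing'].corr(raw['kospi'])
corr_z = z['housing'].corr(z['kospi'])

print(z.mean().round(6).tolist(), z.std().round(6).tolist())
print(corr_raw, corr_z)
```

<p>Each standardized column has mean 0 and standard deviation 1, and the Pearson correlation is the same before and after, since it is unaffected by shifting or positively rescaling either series.</p>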

<p>Over the period from 1997 to the present, the correlation coefficient is about 0.91, which shows that <code class="language-plaintext highlighter-rouge">Seoul real estate</code> and the <code class="language-plaintext highlighter-rouge">KOSPI index</code> have moved very closely together.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">print</span><span class="p">(</span><span class="n">df</span><span class="p">.</span><span class="n">corr</span><span class="p">())</span>
</code></pre></div></div>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>          <span class="n">housing</span>     <span class="n">kospi</span>
<span class="n">housing</span>  <span class="mf">1.000000</span>  <span class="mf">0.914871</span>
<span class="n">kospi</span>    <span class="mf">0.914871</span>  <span class="mf">1.000000</span>
</code></pre></div></div>
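<p>One caveat when reading a correlation computed on index levels: two series that both trend upward tend to show a high correlation even when their month-to-month movements are unrelated. The sketch below uses two independent synthetic series with drift (not real data) to show the difference between correlating levels and correlating changes.</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 274  # same number of months as in the post

# Two independent series that both drift upward (synthetic).
a = pd.Series((1.0 + 0.5 * rng.normal(size=n)).cumsum())
b = pd.Series((1.0 + 0.5 * rng.normal(size=n)).cumsum())

level_corr = a.corr(b)                  # dominated by the shared upward trend
change_corr = a.diff().corr(b.diff())   # monthly changes, independent by construction

print(level_corr, change_corr)
```

<p>A high level correlation is therefore consistent with both indices simply rising over time, which is one reason to also look at shorter windows.</p>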

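<h2 id="구간별로-비교하여도-강한-상관관계를-가지는가">Is the correlation still strong when compared period by period?</h2>

<p>The code below draws two graphs: the first plots the movements of <code class="language-plaintext highlighter-rouge">Seoul real estate</code> and the <code class="language-plaintext highlighter-rouge">KOSPI index</code> together, and the second shows how the correlation, computed over a rolling 10-month window, changes over time.</p>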

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ax1</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplot</span><span class="p">(</span><span class="mi">211</span><span class="p">)</span>
<span class="n">df</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">ax</span> <span class="o">=</span> <span class="n">ax1</span><span class="p">,</span> <span class="n">grid</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>

<span class="n">ax2</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplot</span><span class="p">(</span><span class="mi">212</span><span class="p">)</span>
<span class="n">df_corr</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">df</span><span class="p">[</span><span class="s">'kospi'</span><span class="p">].</span><span class="n">rolling</span><span class="p">(</span><span class="mi">10</span><span class="p">).</span><span class="n">corr</span><span class="p">(</span><span class="n">df</span><span class="p">[</span><span class="s">'housing'</span><span class="p">]),</span> <span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s">'correlation'</span><span class="p">])</span>
<span class="n">df_corr</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">ax</span><span class="o">=</span><span class="n">ax2</span><span class="p">,</span> <span class="n">grid</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>

<span class="n">plt</span><span class="p">.</span><span class="n">tight_layout</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>
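<p>To make the rolling calculation concrete, the sketch below applies the same <code class="language-plaintext highlighter-rouge">rolling(10).corr(...)</code> call to synthetic data and recomputes the last value by hand from the final 10 rows. The data and variable names are illustrative only.</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
df = pd.DataFrame({
    'housing': rng.normal(size=30).cumsum(),
    'kospi': rng.normal(size=30).cumsum(),
})

window = 10
rolling_corr = df['kospi'].rolling(window).corr(df['housing'])

# Recompute the last value by hand from the final 10 rows.
tail = df.iloc[-window:]
manual_last = tail['kospi'].corr(tail['housing'])

print(rolling_corr.tail(3))
print(manual_last)
```

<p>The first <code class="language-plaintext highlighter-rouge">window - 1</code> entries are <code class="language-plaintext highlighter-rouge">NaN</code> because a full window is not yet available, so with monthly data the correlation line starts nine months after the first observation.</p>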

<p><img src="/assets/img/posts/20200502/correlation-housing-and-kospi.png" alt="correlation-housing-and-kospi" /></p>

<p>The result of the same comparison using only data from 2015 onward is as follows.</p>

<p><img src="/assets/img/posts/20200502/correlation-housing-and-kospi_5y.png" alt="correlation-housing-and-kospi-5y" /></p>
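<p>The post does not show how the 2015 subset was produced. One plausible way, sketched here on synthetic monthly data, is to slice the merged frame by date before standardizing, so that the mean and standard deviation come from the sub-period only. The frame name <code class="language-plaintext highlighter-rouge">merged</code> is an assumption.</p>

```python
import numpy as np
import pandas as pd

# Synthetic monthly data standing in for the merged, not yet standardized frame.
index = pd.date_range('2013-01-01', '2016-12-01', freq='MS')
rng = np.random.default_rng(7)
merged = pd.DataFrame({
    'housing': 100 + rng.normal(size=len(index)).cumsum(),
    'kospi': 2000 + 20 * rng.normal(size=len(index)).cumsum(),
}, index=index)

recent = merged.loc['2015':]                        # partial-string slice on a DatetimeIndex
recent = (recent - recent.mean()) / recent.std()    # standardize within the sub-period

print(recent.head())
```

<p>Standardizing after slicing matters because the z-scores depend on the sample used, so the same month can sit above or below zero depending on the period chosen.</p>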

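<p>In the first graph, the two indices move closely together over the long run. The second <code class="language-plaintext highlighter-rouge">correlation</code> graph, however, alternates between periods of strong positive correlation and periods of negative correlation, so the relationship changes over time.</p>

<h2 id="주식과-부동산-어느-것이-더-투자에-유리한가">Which is the better investment, stocks or real estate?</h2>

<p>A correlation above 0.9 between the two indices over a long period may ultimately mean that both have risen together, by at least the rate of inflation. The fact that the correlation changes from period to period may in turn mean that one index repeatedly becomes relatively expensive or relatively cheap compared with the other, forming a cycle.</p>

<p>So I computed the difference between the two normalized indices and plotted it.</p>

<p>Because the difference is computed as the <code class="language-plaintext highlighter-rouge">KOSPI index</code> minus the <code class="language-plaintext highlighter-rouge">Seoul real estate</code> index, a positive value suggests that <code class="language-plaintext highlighter-rouge">Seoul real estate</code> is relatively cheaper and therefore the more attractive investment, while a negative value suggests that the <code class="language-plaintext highlighter-rouge">KOSPI index</code> is the more attractive one.</p>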

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ax1</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplot</span><span class="p">(</span><span class="mi">211</span><span class="p">)</span>
<span class="n">df</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">ax</span><span class="o">=</span><span class="n">ax1</span><span class="p">,</span> <span class="n">grid</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>

<span class="n">ax2</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplot</span><span class="p">(</span><span class="mi">212</span><span class="p">)</span>
<span class="n">df_diff</span> <span class="o">=</span> <span class="n">df</span><span class="p">[</span><span class="s">'kospi'</span><span class="p">]</span> <span class="o">-</span> <span class="n">df</span><span class="p">[</span><span class="s">'housing'</span><span class="p">]</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">plot</span><span class="p">(</span><span class="n">df_diff</span><span class="p">.</span><span class="n">index</span><span class="p">,</span> <span class="n">df_diff</span><span class="p">.</span><span class="n">values</span><span class="p">,</span> <span class="s">'-k'</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">'comparison'</span><span class="p">)</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">fill_between</span><span class="p">(</span><span class="n">df_diff</span><span class="p">.</span><span class="n">index</span><span class="p">,</span> <span class="n">df_diff</span><span class="p">.</span><span class="n">values</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="n">where</span><span class="o">=</span><span class="n">df_diff</span><span class="p">.</span><span class="n">values</span> <span class="o">&gt;</span> <span class="mf">0.0</span><span class="p">,</span> <span class="n">facecolor</span><span class="o">=</span><span class="s">'b'</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.1</span><span class="p">)</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">fill_between</span><span class="p">(</span><span class="n">df_diff</span><span class="p">.</span><span class="n">index</span><span class="p">,</span> <span class="n">df_diff</span><span class="p">.</span><span class="n">values</span><span class="p">,</span> <span class="mf">0.0</span><span class="p">,</span> <span class="n">where</span><span class="o">=</span><span class="n">df_diff</span><span class="p">.</span><span class="n">values</span> <span class="o">&lt;</span> <span class="mf">0.0</span><span class="p">,</span> <span class="n">facecolor</span><span class="o">=</span><span class="s">'r'</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.1</span><span class="p">)</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">legend</span><span class="p">()</span>
<span class="n">ax2</span><span class="p">.</span><span class="n">grid</span><span class="p">()</span>

<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>
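<p>The sign convention used in the plot can be checked on a small hand-made frame. The values and the label strings below are illustrative assumptions, not figures from the data.</p>

```python
import numpy as np
import pandas as pd

# Hand-made standardized values (illustrative only).
z = pd.DataFrame({'kospi': [0.5, -1.0, 0.2], 'housing': [0.1, -0.2, 0.2]})

diff = z['kospi'] - z['housing']

# Positive gap: KOSPI sits higher than housing in standardized terms.
reading = np.select(
    [diff.gt(0), diff.lt(0)],
    ['housing relatively cheaper', 'kospi relatively cheaper'],
    default='no gap',
)

print(list(zip(diff.round(2), reading)))
```

<p>This mirrors the two <code class="language-plaintext highlighter-rouge">fill_between</code> calls: the blue fill covers the positive gap and the red fill covers the negative gap.</p>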

<p><img src="/assets/img/posts/20200502/comparison-housing-and-kospi.png" alt="comparison-housing-and-kospi" /></p>

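<p>In the second graph, the blue-shaded sections can be read as periods favoring <code class="language-plaintext highlighter-rouge">real estate</code>, and the red-shaded sections as periods favoring <code class="language-plaintext highlighter-rouge">stocks</code>.</p>

<p>Indeed, the red-shaded section in 2009 reflects the stock market crash caused by the <code class="language-plaintext highlighter-rouge">Lehman Brothers global financial crisis</code>, which made it a period favoring <code class="language-plaintext highlighter-rouge">stocks</code>, while the blue-shaded section after 2014 was a time when real estate prices surged, so it can be regarded as a period that favored <code class="language-plaintext highlighter-rouge">real estate</code>.</p>

<p>Recently, the coronavirus has sent stock prices tumbling, and the gap has fallen about as far as it did during the 2009 <code class="language-plaintext highlighter-rouge">Lehman Brothers global financial crisis</code>, which again puts the graph in a period favoring <code class="language-plaintext highlighter-rouge">stocks</code>.</p>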

<p>Watching where this graph heads next may also help in anticipating the trend.</p>]]></content><author><name>Jonghyun Ho</name></author><category term="Data" /><category term="Analysis" /><category term="Crawling" /><category term="Python" /><category term="Pandas" /><category term="DataFrame" /><category term="KOSPI" /><category term="부동산" /><category term="아파트" /><category term="매매가격지수" /><summary type="html"><![CDATA[It is often said that when real estate prices rise, stock prices fall, and when real estate prices fall, stock prices rise.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://jonghyunho.github.io/posts/20200502/comparison-housing-and-kospi.png" /><media:content medium="image" url="https://jonghyunho.github.io/posts/20200502/comparison-housing-and-kospi.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Housing Purchase Price Composite Index</title><link href="https://jonghyunho.github.io/data/analysis/housing-purchase-price-composite-indices.html" rel="alternate" type="text/html" title="Housing Purchase Price Composite Index" /><published>2020-05-01T00:00:00+09:00</published><updated>2020-05-01T00:00:00+09:00</updated><id>https://jonghyunho.github.io/data/analysis/housing-purchase-price-composite-indices</id><content type="html" xml:base="https://jonghyunho.github.io/data/analysis/housing-purchase-price-composite-indices.html"><![CDATA[<p>To see how much apartment prices have risen or fallen, it helps to consult the <code class="language-plaintext highlighter-rouge">주택매매가격 종합지수</code> (Housing Purchase Price Composite Index).</p>

<p>Both <code class="language-plaintext highlighter-rouge">KB 부동산</code> (KB Real Estate) and <code class="language-plaintext highlighter-rouge">통계청</code> (Statistics Korea) maintain this index. This post starts with the <code class="language-plaintext highlighter-rouge">KB 부동산</code> data to examine how real estate prices have risen and fallen.</p>

<h2 id="kb-부동산에서-주택매매가격-지수-확인">Checking the housing purchase price index on KB 부동산</h2>

<p>On the <a href="https://onland.kbstar.com/">KB 부동산</a> site, the <code class="language-plaintext highlighter-rouge">뉴스/자료실</code> (News/Resources) section contains a <code class="language-plaintext highlighter-rouge">월간 KB주택가격동향</code> (Monthly KB Housing Price Trends) menu.</p>

<p>On that page, the post titled <code class="language-plaintext highlighter-rouge">★시계열 자료 2020년 4월 기준 (1986년 1월 부터)</code> (time-series data as of April 2020, starting from January 1986) has an Excel file of time-series data attached. Download this file.</p>

<p><img src="/assets/img/posts/20200501/kb_onland_liivon.png" alt="kb_onland_liivon" /></p>

<p>For reference, the time-series data under the <code class="language-plaintext highlighter-rouge">주간 KB주택시장동향</code> (Weekly KB Housing Market Trends) menu presents the weekly rate of change by region visually, as follows.</p>

<p><img src="/assets/img/posts/20200501/changing_rate_of_apt_purchase_price.png" alt="changing_rate_of_apt_purchase_price" /></p>

<h2 id="python-을-이용한-분석">Analysis with Python</h2>

<p>Import the required modules.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">datetime</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">from</span> <span class="nn">matplotlib</span> <span class="kn">import</span> <span class="n">font_manager</span><span class="p">,</span> <span class="n">rc</span>
</code></pre></div></div>

<p>Configure <code class="language-plaintext highlighter-rouge">matplotlib</code> so that Korean text can be rendered when drawing graphs.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">font_name</span> <span class="o">=</span> <span class="n">font_manager</span><span class="p">.</span><span class="n">FontProperties</span><span class="p">(</span><span class="n">fname</span><span class="o">=</span><span class="s">"c:/Windows/Fonts/malgun.ttf"</span><span class="p">).</span><span class="n">get_name</span><span class="p">()</span>
<span class="n">rc</span><span class="p">(</span><span class="s">'font'</span><span class="p">,</span> <span class="n">family</span><span class="o">=</span><span class="n">font_name</span><span class="p">)</span>
</code></pre></div></div>
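<p>On machines other than Windows the font file above will not exist. The following is a minimal sketch of one way to choose a font file that is actually present. The helper name <code class="language-plaintext highlighter-rouge">pick_font_path</code> and the macOS and Linux paths are assumptions for illustration, not part of the original setup; only the first path comes from this post.</p>

```python
import os

# Hypothetical candidate paths; only the first one comes from the post itself.
CANDIDATE_FONTS = [
    "c:/Windows/Fonts/malgun.ttf",                      # Windows (used in the post)
    "/System/Library/Fonts/AppleSDGothicNeo.ttc",       # macOS (assumed location)
    "/usr/share/fonts/truetype/nanum/NanumGothic.ttf",  # Linux with Nanum fonts (assumed)
]


def pick_font_path(candidates=CANDIDATE_FONTS):
    """Return the first candidate path that exists on disk, or None."""
    for path in candidates:
        if os.path.exists(path):
            return path
    return None


font_path = pick_font_path()
if font_path is None:
    print("No Korean font found; Hangul labels may render as empty boxes.")
else:
    # This path can be passed to FontProperties(fname=font_path) as in the post.
    print("Using font file:", font_path)
```

<p>Checking for the file first avoids an error from <code class="language-plaintext highlighter-rouge">FontProperties</code> when the hard-coded path is missing.</p>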

<p>The following function takes the path of the downloaded Excel file and the name of a sheet inside it, and returns a <code class="language-plaintext highlighter-rouge">pandas DataFrame</code>.</p>

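<p>Before returning the <code class="language-plaintext highlighter-rouge">DataFrame</code>, it preprocesses the columns and rows: the two header rows are combined into two-level column labels, and the row labels are converted to dates.</p>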

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">get_house_price_index</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">sheet</span><span class="p">):</span>
    <span class="n">data</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">read_excel</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="n">sheet_name</span><span class="o">=</span><span class="n">sheet</span><span class="p">,</span> <span class="n">skiprows</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">.</span><span class="n">set_index</span><span class="p">(</span><span class="s">'구분'</span><span class="p">)</span>
    <span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">.</span><span class="n">drop</span><span class="p">(</span><span class="s">'Classification'</span><span class="p">)</span>

    <span class="c1"># build two-level column labels: region group on top, sub-region below
</span>    <span class="n">bignames</span> <span class="o">=</span> <span class="s">'서울 대구 부산 대전 광주 인천 울산 세종 경기 강원 충북 충남 전북 전남 경북 경남 제주도 6개광역시 5개광역시 수도권 기타지방 구분 전국'</span>
    <span class="n">bignames</span> <span class="o">=</span> <span class="n">bignames</span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="s">' '</span><span class="p">)</span>
    <span class="n">big_col</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">columns</span><span class="p">)</span>
    <span class="n">small_col</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">iloc</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>

    <span class="k">for</span> <span class="n">num</span><span class="p">,</span> <span class="n">small</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">small_col</span><span class="p">):</span>
        <span class="k">if</span> <span class="nb">str</span><span class="p">(</span><span class="n">small</span><span class="p">)</span> <span class="o">==</span> <span class="s">'nan'</span><span class="p">:</span>
            <span class="n">small_col</span><span class="p">[</span><span class="n">num</span><span class="p">]</span> <span class="o">=</span> <span class="n">big_col</span><span class="p">[</span><span class="n">num</span><span class="p">]</span>

        <span class="n">check</span> <span class="o">=</span> <span class="n">num</span>
        <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
            <span class="k">if</span> <span class="n">big_col</span><span class="p">[</span><span class="n">check</span><span class="p">]</span> <span class="ow">in</span> <span class="n">bignames</span><span class="p">:</span>
                <span class="n">big_col</span><span class="p">[</span><span class="n">num</span><span class="p">]</span> <span class="o">=</span> <span class="n">big_col</span><span class="p">[</span><span class="n">check</span><span class="p">]</span>
                <span class="k">break</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="n">check</span> <span class="o">-=</span> <span class="mi">1</span>

    <span class="n">data</span><span class="p">.</span><span class="n">columns</span> <span class="o">=</span> <span class="p">[</span><span class="n">big_col</span><span class="p">,</span> <span class="n">small_col</span><span class="p">]</span>

    <span class="c1"># drop rows without a label, then rebuild the index as dates
</span>    <span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="n">data</span><span class="p">.</span><span class="n">index</span><span class="p">.</span><span class="n">notnull</span><span class="p">()]</span>
    <span class="n">data</span><span class="p">.</span><span class="n">index</span><span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="s">'date'</span>

    <span class="n">new_index</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">num</span><span class="p">,</span> <span class="n">raw_index</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">index</span><span class="p">):</span>
        <span class="n">raw_index</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">raw_index</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">raw_index</span><span class="p">.</span><span class="n">find</span><span class="p">(</span><span class="s">'.'</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">:</span>
            <span class="n">temp</span> <span class="o">=</span> <span class="n">raw_index</span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="s">'.'</span><span class="p">)</span>
            <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">temp</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span> <span class="o">==</span> <span class="mi">2</span><span class="p">:</span>
                <span class="n">new_index</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="s">'19'</span> <span class="o">+</span> <span class="n">temp</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="s">'.'</span> <span class="o">+</span> <span class="n">temp</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="n">new_index</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">raw_index</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">new_index</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">new_index</span><span class="p">[</span><span class="n">num</span><span class="o">-</span><span class="mi">1</span><span class="p">].</span><span class="n">split</span><span class="p">(</span><span class="s">'.'</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="s">'.'</span> <span class="o">+</span> <span class="n">raw_index</span><span class="p">)</span>

    <span class="n">data</span><span class="p">.</span><span class="n">set_index</span><span class="p">(</span><span class="n">pd</span><span class="p">.</span><span class="n">to_datetime</span><span class="p">(</span><span class="n">new_index</span><span class="p">),</span> <span class="n">inplace</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">data</span>
</code></pre></div></div>
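<p>The two preprocessing steps in the function can be tried in isolation without the Excel file. The sketch below is a simplified restatement, not the function itself: the helper names <code class="language-plaintext highlighter-rouge">fill_group_names</code> and <code class="language-plaintext highlighter-rouge">normalize_labels</code> and the sample inputs are made up for illustration, and the header fill walks forward rather than backward as the original loop does.</p>

```python
from datetime import datetime


def fill_group_names(headers, known_groups):
    """Replace each header that is not a known group with the nearest known group to its left."""
    filled = []
    current = None
    for header in headers:
        if header in known_groups:
            current = header
        filled.append(current)
    return filled


def normalize_labels(raw_labels):
    """Turn labels such as '86.1', '2', '2000.1' into 'YYYY.M' strings."""
    result = []
    for raw in raw_labels:
        raw = str(raw)
        if "." in raw:
            year, month = raw.split(".")
            if len(year) == 2:  # two-digit year: treated as 19xx, as in the post
                year = "19" + year
            result.append(year + "." + month)
        else:  # month only: reuse the year of the previous label
            previous_year = result[-1].split(".")[0]
            result.append(previous_year + "." + raw)
    return result


# Assumed sample header row: merged cells show up as placeholder names.
headers = ["서울", "Unnamed: 2", "Unnamed: 3", "경기", "Unnamed: 5"]
print(fill_group_names(headers, {"서울", "경기"}))

# Assumed sample row labels: a year appears only on the first month.
labels = normalize_labels(["86.1", "2", "3", "2000.1", "2"])
print(labels)
print([datetime.strptime(label, "%Y.%m") for label in labels][:2])
```

<p>Separating the steps this way makes it easier to check each rule against a few hand-written labels before running it on the full sheet.</p>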

<p>The following calls the <code class="language-plaintext highlighter-rouge">get_house_price_index</code> function implemented above and prints all of the data for the <code class="language-plaintext highlighter-rouge">주택매매가격 종합지수</code> (composite housing sale price index).</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">data</span> <span class="o">=</span> <span class="n">get_house_price_index</span><span class="p">(</span><span class="s">'datasets/★(월간)KB주택가격동향_시계열(2020.04).xlsx'</span><span class="p">,</span> <span class="s">'매매종합'</span><span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>

                 <span class="n">전국</span>       <span class="n">서울</span>                    <span class="p">...</span>   <span class="n">경남</span>  <span class="n">제주도</span>              <span class="n">기타지방</span>
                 <span class="n">전국</span>       <span class="n">서울</span>       <span class="n">강북</span>      <span class="n">강북구</span>  <span class="p">...</span>   <span class="n">통영</span>  <span class="n">제주도</span> <span class="n">제주</span><span class="o">/</span>\<span class="n">n서귀포</span>     <span class="n">기타지방</span>
<span class="mi">1986</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">34.6561</span>  <span class="mf">30.0438</span>    <span class="mf">41.94</span>      <span class="n">NaN</span>  <span class="p">...</span>  <span class="n">NaN</span>  <span class="n">NaN</span>      <span class="n">NaN</span>      <span class="n">NaN</span>
<span class="mi">1986</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">34.6561</span>  <span class="mf">30.0438</span>  <span class="mf">41.8891</span>      <span class="n">NaN</span>  <span class="p">...</span>  <span class="n">NaN</span>  <span class="n">NaN</span>      <span class="n">NaN</span>      <span class="n">NaN</span>
<span class="mi">1986</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">01</span>   <span class="mf">34.708</span>  <span class="mf">30.0024</span>  <span class="mf">41.8891</span>      <span class="n">NaN</span>  <span class="p">...</span>  <span class="n">NaN</span>  <span class="n">NaN</span>      <span class="n">NaN</span>      <span class="n">NaN</span>
<span class="mi">1986</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">34.4486</span>  <span class="mf">29.8366</span>  <span class="mf">41.7366</span>      <span class="n">NaN</span>  <span class="p">...</span>  <span class="n">NaN</span>  <span class="n">NaN</span>      <span class="n">NaN</span>      <span class="n">NaN</span>
<span class="mi">1986</span><span class="o">-</span><span class="mi">05</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">34.2929</span>   <span class="mf">29.588</span>  <span class="mf">41.2791</span>      <span class="n">NaN</span>  <span class="p">...</span>  <span class="n">NaN</span>  <span class="n">NaN</span>      <span class="n">NaN</span>      <span class="n">NaN</span>
<span class="p">...</span>             <span class="p">...</span>      <span class="p">...</span>      <span class="p">...</span>      <span class="p">...</span>  <span class="p">...</span>  <span class="p">...</span>  <span class="p">...</span>      <span class="p">...</span>      <span class="p">...</span>
<span class="mi">2019</span><span class="o">-</span><span class="mi">12</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">100.222</span>   <span class="mf">102.56</span>  <span class="mf">102.275</span>  <span class="mf">103.256</span>  <span class="p">...</span>  <span class="n">NaN</span>  <span class="n">NaN</span>  <span class="mf">97.1319</span>  <span class="mf">97.5571</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">01</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">100.576</span>  <span class="mf">103.056</span>  <span class="mf">102.687</span>  <span class="mf">103.382</span>  <span class="p">...</span>  <span class="n">NaN</span>  <span class="n">NaN</span>  <span class="mf">97.0367</span>  <span class="mf">97.5658</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">02</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">100.948</span>  <span class="mf">103.417</span>  <span class="mf">103.006</span>  <span class="mf">103.488</span>  <span class="p">...</span>  <span class="n">NaN</span>  <span class="n">NaN</span>  <span class="mf">96.9613</span>  <span class="mf">97.5925</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">03</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">101.511</span>  <span class="mf">103.905</span>  <span class="mf">103.403</span>  <span class="mf">103.332</span>  <span class="p">...</span>  <span class="n">NaN</span>  <span class="n">NaN</span>  <span class="mf">96.8012</span>  <span class="mf">97.6337</span>
<span class="mi">2020</span><span class="o">-</span><span class="mi">04</span><span class="o">-</span><span class="mi">01</span>  <span class="mf">101.744</span>  <span class="mf">104.072</span>  <span class="mf">103.577</span>  <span class="mf">103.194</span>  <span class="p">...</span>  <span class="n">NaN</span>  <span class="n">NaN</span>  <span class="mf">96.7289</span>  <span class="mf">97.6126</span>

<span class="p">[</span><span class="mi">412</span> <span class="n">rows</span> <span class="n">x</span> <span class="mi">186</span> <span class="n">columns</span><span class="p">]</span>
</code></pre></div></div>

<h2 id="지역별로-주택-가격은-어느-정도의-등락-차이가-있을까">How much do housing price changes differ by region?</h2>

<p>From the data built above, the Gangnam (강남), Gangbuk (강북), and Seoul metropolitan area (수도권) series are selected into a separate <code class="language-plaintext highlighter-rouge">DataFrame</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">DataFrame</span><span class="p">()</span>
<span class="n">df</span><span class="p">[</span><span class="s">'강남'</span><span class="p">]</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s">'서울'</span><span class="p">][</span><span class="s">'강남'</span><span class="p">]</span>
<span class="n">df</span><span class="p">[</span><span class="s">'강북'</span><span class="p">]</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s">'서울'</span><span class="p">][</span><span class="s">'강북'</span><span class="p">]</span>
<span class="n">df</span><span class="p">[</span><span class="s">'수도권'</span><span class="p">]</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s">'수도권'</span><span class="p">][</span><span class="s">'수도권'</span><span class="p">]</span>
</code></pre></div></div>
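<p>Because the columns have two levels, the same selection can also be written with <code class="language-plaintext highlighter-rouge">(group, sub-region)</code> tuples. The sketch below uses a small made-up table, so the numbers are not KB data; only the column labels mirror the ones used above.</p>

```python
import pandas as pd

# Toy data with made-up numbers; only the column labels mirror the post.
columns = pd.MultiIndex.from_tuples(
    [("서울", "강남"), ("서울", "강북"), ("수도권", "수도권")]
)
toy = pd.DataFrame(
    [[100.0, 100.0, 100.0], [102.0, 101.0, 100.5]],
    index=pd.to_datetime(["2015-01-01", "2015-02-01"]),
    columns=columns,
)

# Select the three series by their (group, sub-region) tuples in one step.
wanted = [("서울", "강남"), ("서울", "강북"), ("수도권", "수도권")]
selected = toy.loc[:, wanted].copy()

# Flatten the two-level labels down to the sub-region names.
selected.columns = selected.columns.get_level_values(1)
print(selected)
```

<p>Selecting by tuple keeps the lookup explicit about which group a sub-region belongs to, which matters when the same sub-region name appears under several groups.</p>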

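<p>The sample covers the changes after January 1, 2015. Because the filter uses a strict comparison, the first retained row is February 2015.</p>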

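<p>The <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pct_change.html">pct_change function</a> computes the fractional change between each row and the previous one, and the <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.cumsum.html">cumsum function</a> computes the running total of those values. The running sum of the changes is used here to compare the regions; note that it only approximates the compounded change over the period.</p>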

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">[</span><span class="n">df</span><span class="p">.</span><span class="n">index</span> <span class="o">&gt;</span> <span class="n">datetime</span><span class="p">.</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2015</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)]</span>
<span class="n">df</span> <span class="o">=</span> <span class="n">df</span><span class="p">.</span><span class="n">pct_change</span><span class="p">().</span><span class="n">cumsum</span><span class="p">()</span>
</code></pre></div></div>
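<p>To see how the running sum relates to the compounded change, consider a made-up series that rises by 10% twice. This is an illustrative sketch with invented numbers, not KB data.</p>

```python
import pandas as pd

# Made-up index values: two consecutive 10% increases.
prices = pd.Series([100.0, 110.0, 121.0])

summed = prices.pct_change().cumsum()     # running sum of period-over-period changes
compounded = prices / prices.iloc[0] - 1  # change relative to the first value

print(summed.tolist())
print(compounded.tolist())
```

<p>The running sum ends at about 0.20 while the compounded change is about 0.21. The gap is small for monthly moves of the size seen here, but it grows with larger or longer swings.</p>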

<p>The resulting plot looks as follows.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">df</span><span class="p">.</span><span class="n">plot</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="n">grid</span><span class="p">()</span>
<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>

<p><img src="/assets/img/posts/20200501/housing-price-indices.png" alt="housing-price-indices" /></p>

<p>The plot shows that the size of the increase ranks as Seoul Gangnam &gt; Seoul Gangbuk &gt; Seoul metropolitan area.</p>

<p>Next, let's look at the rate of change for every region, not only these three.</p>

<p>The rate of change between January 1, 2015 and April 1, 2020 is computed for each region, and the results are sorted in descending order.</p>

<p>The results below show that districts such as <code class="language-plaintext highlighter-rouge">서울 영등포구</code> and <code class="language-plaintext highlighter-rouge">경기 분당구</code> rose even more than <code class="language-plaintext highlighter-rouge">서울 강남구</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">base_date</span> <span class="o">=</span> <span class="n">datetime</span><span class="p">.</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2015</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">target_date</span> <span class="o">=</span> <span class="n">datetime</span><span class="p">.</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2020</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">.</span><span class="n">loc</span><span class="p">[[</span><span class="n">base_date</span><span class="p">,</span> <span class="n">target_date</span><span class="p">]]</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">.</span><span class="n">T</span>
<span class="n">data</span><span class="p">[</span><span class="s">'result'</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="n">target_date</span><span class="p">]</span> <span class="o">-</span> <span class="n">data</span><span class="p">[</span><span class="n">base_date</span><span class="p">])</span> <span class="o">/</span> <span class="n">data</span><span class="p">[</span><span class="n">base_date</span><span class="p">]</span> <span class="o">*</span> <span class="mf">100.0</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">.</span><span class="n">dropna</span><span class="p">()</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s">'result'</span><span class="p">].</span><span class="n">sort_values</span><span class="p">(</span><span class="n">ascending</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="k">for</span> <span class="n">index</span> <span class="ow">in</span> <span class="n">data</span><span class="p">.</span><span class="n">index</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="n">index</span><span class="p">,</span> <span class="s">'{:.2f}%'</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="n">index</span><span class="p">]))</span>
</code></pre></div></div>
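<p>As an aside, the same percent change can be computed without transposing, by dividing the two rows directly. The sketch below uses a made-up two-column table, so its numbers are unrelated to the listing that follows, which is the output of the code above.</p>

```python
import pandas as pd

# Made-up index values for illustration; the region labels mirror the post.
base_date = pd.Timestamp(2015, 1, 1)
target_date = pd.Timestamp(2020, 4, 1)
toy = pd.DataFrame(
    {("서울", "강남구"): [80.0, 100.0], ("부산", "부산"): [100.0, 105.0]},
    index=[base_date, target_date],
)

# Percent change from base_date to target_date for every column at once.
change = (toy.loc[target_date] / toy.loc[base_date] - 1) * 100.0
change = change.dropna().sort_values(ascending=False)

for region, value in change.items():
    print(region, "{:.2f}%".format(value))
```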
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'영등포구'</span><span class="o">)</span> 43.27%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'분당구'</span><span class="o">)</span> 43.01%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'강남구'</span><span class="o">)</span> 41.52%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'광명'</span><span class="o">)</span> 38.57%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'양천구'</span><span class="o">)</span> 34.08%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'서초구'</span><span class="o">)</span> 33.70%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'송파구'</span><span class="o">)</span> 33.11%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'동대문구'</span><span class="o">)</span> 32.20%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'강남'</span><span class="o">)</span> 31.09%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'노원구'</span><span class="o">)</span> 31.05%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'마포구'</span><span class="o">)</span> 30.82%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'성남'</span><span class="o">)</span> 30.36%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'성동구'</span><span class="o">)</span> 29.93%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'동안구'</span><span class="o">)</span> 28.39%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'강동구'</span><span class="o">)</span> 28.14%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'서울'</span><span class="o">)</span> 28.07%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'서대문구'</span><span class="o">)</span> 27.96%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'구로구'</span><span class="o">)</span> 27.96%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'용산구'</span><span class="o">)</span> 27.96%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'동작구'</span><span class="o">)</span> 27.45%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'성북구'</span><span class="o">)</span> 27.11%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'안양'</span><span class="o">)</span> 26.99%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'과천'</span><span class="o">)</span> 26.88%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'강서구'</span><span class="o">)</span> 25.77%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'수지구'</span><span class="o">)</span> 25.66%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'중구'</span><span class="o">)</span> 25.11%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'강북'</span><span class="o">)</span> 25.04%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'만안구'</span><span class="o">)</span> 24.69%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'영통구'</span><span class="o">)</span> 24.62%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'군포'</span><span class="o">)</span> 23.26%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'도봉구'</span><span class="o">)</span> 23.11%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'구리'</span><span class="o">)</span> 22.67%
<span class="o">(</span><span class="s1">'대구'</span>, <span class="s1">'수성구'</span><span class="o">)</span> 22.14%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'금천구'</span><span class="o">)</span> 21.42%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'중랑구'</span><span class="o">)</span> 19.83%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'광진구'</span><span class="o">)</span> 19.76%
<span class="o">(</span><span class="s1">'수도권'</span>, <span class="s1">'수도권'</span><span class="o">)</span> 19.70%
<span class="o">(</span><span class="s1">'광주'</span>, <span class="s1">'광산구'</span><span class="o">)</span> 19.66%
<span class="o">(</span><span class="s1">'광주'</span>, <span class="s1">'서구'</span><span class="o">)</span> 19.45%
<span class="o">(</span><span class="s1">'대전'</span>, <span class="s1">'서구'</span><span class="o">)</span> 19.35%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'관악구'</span><span class="o">)</span> 18.28%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'덕양구'</span><span class="o">)</span> 18.17%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'부천'</span><span class="o">)</span> 17.50%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'수원'</span><span class="o">)</span> 17.37%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'종로구'</span><span class="o">)</span> 16.70%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'장안구'</span><span class="o">)</span> 16.54%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'은평구'</span><span class="o">)</span> 16.34%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'중원구'</span><span class="o">)</span> 16.15%
<span class="o">(</span><span class="s1">'대구'</span>, <span class="s1">'동구'</span><span class="o">)</span> 15.92%
<span class="o">(</span><span class="s1">'대구'</span>, <span class="s1">'중구'</span><span class="o">)</span> 15.53%
<span class="o">(</span><span class="s1">'광주'</span>, <span class="s1">'광주'</span><span class="o">)</span> 15.35%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'용인'</span><span class="o">)</span> 15.31%
<span class="o">(</span><span class="s1">'대전'</span>, <span class="s1">'유성구'</span><span class="o">)</span> 15.12%
<span class="o">(</span><span class="s1">'인천'</span>, <span class="s1">'부평구'</span><span class="o">)</span> 15.12%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'기흥구'</span><span class="o">)</span> 15.08%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'해운대구'</span><span class="o">)</span> 15.07%
<span class="o">(</span><span class="s1">'대구'</span>, <span class="s1">'서구'</span><span class="o">)</span> 14.98%
<span class="o">(</span><span class="s1">'서울'</span>, <span class="s1">'강북구'</span><span class="o">)</span> 14.96%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'경기'</span><span class="o">)</span> 14.87%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'남구'</span><span class="o">)</span> 14.74%
<span class="o">(</span><span class="s1">'대전'</span>, <span class="s1">'대전'</span><span class="o">)</span> 14.20%
<span class="o">(</span><span class="s1">'대전'</span>, <span class="s1">'중구'</span><span class="o">)</span> 14.10%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'팔달구'</span><span class="o">)</span> 13.96%
<span class="o">(</span><span class="s1">'광주'</span>, <span class="s1">'남구'</span><span class="o">)</span> 13.81%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'수영구'</span><span class="o">)</span> 13.47%
<span class="o">(</span><span class="s1">'대구'</span>, <span class="s1">'대구'</span><span class="o">)</span> 13.14%
<span class="o">(</span><span class="s1">'인천'</span>, <span class="s1">'연수구'</span><span class="o">)</span> 13.08%
<span class="o">(</span><span class="s1">'광주'</span>, <span class="s1">'동구'</span><span class="o">)</span> 13.06%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'고양'</span><span class="o">)</span> 13.00%
<span class="o">(</span><span class="s1">'대구'</span>, <span class="s1">'달서구'</span><span class="o">)</span> 12.82%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'수정구'</span><span class="o">)</span> 12.79%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'동래구'</span><span class="o">)</span> 12.76%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'권선구'</span><span class="o">)</span> 12.69%
<span class="o">(</span><span class="s1">'인천'</span>, <span class="s1">'계양구'</span><span class="o">)</span> 12.57%
<span class="o">(</span><span class="s1">'전국'</span>, <span class="s1">'전국'</span><span class="o">)</span> 12.34%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'김포'</span><span class="o">)</span> 12.30%
<span class="o">(</span><span class="s1">'제주도'</span>, <span class="s1">'제주/\n서귀포'</span><span class="o">)</span> 11.74%
<span class="o">(</span><span class="s1">'인천'</span>, <span class="s1">'인천'</span><span class="o">)</span> 11.60%
<span class="o">(</span><span class="s1">'대전'</span>, <span class="s1">'동구'</span><span class="o">)</span> 11.57%
<span class="o">(</span><span class="s1">'인천'</span>, <span class="s1">'서구'</span><span class="o">)</span> 11.37%
<span class="o">(</span><span class="s1">'인천'</span>, <span class="s1">'미추홀구'</span><span class="o">)</span> 11.22%
<span class="o">(</span><span class="s1">'6개광역시'</span>, <span class="s1">'6개광역시'</span><span class="o">)</span> 10.63%
<span class="o">(</span><span class="s1">'울산'</span>, <span class="s1">'5개광역시\n(인천外)'</span><span class="o">)</span> 10.33%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'단원구'</span><span class="o">)</span> 10.21%
<span class="o">(</span><span class="s1">'광주'</span>, <span class="s1">'북구'</span><span class="o">)</span> 10.15%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'남양주'</span><span class="o">)</span> 10.05%
<span class="o">(</span><span class="s1">'인천'</span>, <span class="s1">'남동구'</span><span class="o">)</span> 9.91%
<span class="o">(</span><span class="s1">'대구'</span>, <span class="s1">'남구'</span><span class="o">)</span> 9.88%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'일산동구'</span><span class="o">)</span> 9.60%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'의정부'</span><span class="o">)</span> 9.42%
<span class="o">(</span><span class="s1">'전남'</span>, <span class="s1">'여수'</span><span class="o">)</span> 9.00%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'일산서구'</span><span class="o">)</span> 8.42%
<span class="o">(</span><span class="s1">'전남'</span>, <span class="s1">'순천'</span><span class="o">)</span> 8.30%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'부산'</span><span class="o">)</span> 8.25%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'연제구'</span><span class="o">)</span> 8.21%
<span class="o">(</span><span class="s1">'세종'</span>, <span class="s1">'세종'</span><span class="o">)</span> 7.56%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'안산'</span><span class="o">)</span> 7.45%
<span class="o">(</span><span class="s1">'대구'</span>, <span class="s1">'북구'</span><span class="o">)</span> 7.42%
<span class="o">(</span><span class="s1">'전북'</span>, <span class="s1">'익산'</span><span class="o">)</span> 6.89%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'서구'</span><span class="o">)</span> 6.79%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'시흥'</span><span class="o">)</span> 6.66%
<span class="o">(</span><span class="s1">'전남'</span>, <span class="s1">'전남'</span><span class="o">)</span> 6.63%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'금정구'</span><span class="o">)</span> 6.57%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'파주'</span><span class="o">)</span> 6.22%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'화성'</span><span class="o">)</span> 6.08%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'부산진구'</span><span class="o">)</span> 6.04%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'사상구'</span><span class="o">)</span> 5.12%
<span class="o">(</span><span class="s1">'인천'</span>, <span class="s1">'동구'</span><span class="o">)</span> 5.00%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'동구'</span><span class="o">)</span> 4.83%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'상록구'</span><span class="o">)</span> 4.82%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'사하구'</span><span class="o">)</span> 4.51%
<span class="o">(</span><span class="s1">'인천'</span>, <span class="s1">'중구'</span><span class="o">)</span> 3.63%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'기장군'</span><span class="o">)</span> 3.15%
<span class="o">(</span><span class="s1">'울산'</span>, <span class="s1">'남구'</span><span class="o">)</span> 2.97%
<span class="o">(</span><span class="s1">'대전'</span>, <span class="s1">'대덕구'</span><span class="o">)</span> 2.87%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'북구'</span><span class="o">)</span> 2.84%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'중구'</span><span class="o">)</span> 2.66%
<span class="o">(</span><span class="s1">'전북'</span>, <span class="s1">'덕진구'</span><span class="o">)</span> 2.61%
<span class="o">(</span><span class="s1">'부산'</span>, <span class="s1">'영도구'</span><span class="o">)</span> 2.29%
<span class="o">(</span><span class="s1">'강원'</span>, <span class="s1">'춘천'</span><span class="o">)</span> 1.79%
<span class="o">(</span><span class="s1">'강원'</span>, <span class="s1">'강원'</span><span class="o">)</span> 1.78%
<span class="o">(</span><span class="s1">'대구'</span>, <span class="s1">'달성군'</span><span class="o">)</span> 1.78%
<span class="o">(</span><span class="s1">'전남'</span>, <span class="s1">'목포'</span><span class="o">)</span> 1.77%
<span class="o">(</span><span class="s1">'강원'</span>, <span class="s1">'원주'</span><span class="o">)</span> 1.74%
<span class="o">(</span><span class="s1">'경남'</span>, <span class="s1">'진주'</span><span class="o">)</span> 1.44%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'이천'</span><span class="o">)</span> 1.25%
<span class="o">(</span><span class="s1">'전북'</span>, <span class="s1">'전주'</span><span class="o">)</span> 1.22%
<span class="o">(</span><span class="s1">'전북'</span>, <span class="s1">'전북'</span><span class="o">)</span> 0.75%
<span class="o">(</span><span class="s1">'전북'</span>, <span class="s1">'완산구'</span><span class="o">)</span> 0.11%
<span class="o">(</span><span class="s1">'충남'</span>, <span class="s1">'논산'</span><span class="o">)</span> <span class="nt">-0</span>.75%
<span class="o">(</span><span class="s1">'충남'</span>, <span class="s1">'공주'</span><span class="o">)</span> <span class="nt">-1</span>.03%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'처인구'</span><span class="o">)</span> <span class="nt">-1</span>.13%
<span class="o">(</span><span class="s1">'울산'</span>, <span class="s1">'중구'</span><span class="o">)</span> <span class="nt">-1</span>.19%
<span class="o">(</span><span class="s1">'경북'</span>, <span class="s1">'남구'</span><span class="o">)</span> <span class="nt">-1</span>.64%
<span class="o">(</span><span class="s1">'울산'</span>, <span class="s1">'울주군'</span><span class="o">)</span> <span class="nt">-1</span>.76%
<span class="o">(</span><span class="s1">'기타지방'</span>, <span class="s1">'기타지방'</span><span class="o">)</span> <span class="nt">-2</span>.77%
<span class="o">(</span><span class="s1">'경북'</span>, <span class="s1">'경산'</span><span class="o">)</span> <span class="nt">-2</span>.85%
<span class="o">(</span><span class="s1">'울산'</span>, <span class="s1">'울산'</span><span class="o">)</span> <span class="nt">-2</span>.98%
<span class="o">(</span><span class="s1">'경북'</span>, <span class="s1">'포항'</span><span class="o">)</span> <span class="nt">-3</span>.97%
<span class="o">(</span><span class="s1">'충북'</span>, <span class="s1">'충주'</span><span class="o">)</span> <span class="nt">-4</span>.32%
<span class="o">(</span><span class="s1">'경남'</span>, <span class="s1">'마산회원구'</span><span class="o">)</span> <span class="nt">-5</span>.21%
<span class="o">(</span><span class="s1">'경기'</span>, <span class="s1">'평택'</span><span class="o">)</span> <span class="nt">-5</span>.70%
<span class="o">(</span><span class="s1">'경북'</span>, <span class="s1">'북구'</span><span class="o">)</span> <span class="nt">-5</span>.81%
<span class="o">(</span><span class="s1">'충남'</span>, <span class="s1">'충남'</span><span class="o">)</span> <span class="nt">-6</span>.07%
<span class="o">(</span><span class="s1">'충남'</span>, <span class="s1">'아산'</span><span class="o">)</span> <span class="nt">-6</span>.16%
<span class="o">(</span><span class="s1">'경남'</span>, <span class="s1">'마산합포구'</span><span class="o">)</span> <span class="nt">-6</span>.32%
<span class="o">(</span><span class="s1">'충남'</span>, <span class="s1">'동남구'</span><span class="o">)</span> <span class="nt">-6</span>.78%
<span class="o">(</span><span class="s1">'전북'</span>, <span class="s1">'군산'</span><span class="o">)</span> <span class="nt">-6</span>.79%
<span class="o">(</span><span class="s1">'울산'</span>, <span class="s1">'북구'</span><span class="o">)</span> <span class="nt">-7</span>.16%
<span class="o">(</span><span class="s1">'경북'</span>, <span class="s1">'경북'</span><span class="o">)</span> <span class="nt">-7</span>.17%
<span class="o">(</span><span class="s1">'충북'</span>, <span class="s1">'흥덕구'</span><span class="o">)</span> <span class="nt">-7</span>.39%
<span class="o">(</span><span class="s1">'충북'</span>, <span class="s1">'충북'</span><span class="o">)</span> <span class="nt">-7</span>.48%
<span class="o">(</span><span class="s1">'경남'</span>, <span class="s1">'경남'</span><span class="o">)</span> <span class="nt">-7</span>.62%
<span class="o">(</span><span class="s1">'충남'</span>, <span class="s1">'천안'</span><span class="o">)</span> <span class="nt">-8</span>.09%
<span class="o">(</span><span class="s1">'충북'</span>, <span class="s1">'청주'</span><span class="o">)</span> <span class="nt">-8</span>.42%
<span class="o">(</span><span class="s1">'경남'</span>, <span class="s1">'김해'</span><span class="o">)</span> <span class="nt">-8</span>.85%
<span class="o">(</span><span class="s1">'충남'</span>, <span class="s1">'서북구'</span><span class="o">)</span> <span class="nt">-9</span>.14%
<span class="o">(</span><span class="s1">'경남'</span>, <span class="s1">'진해구'</span><span class="o">)</span> <span class="nt">-9</span>.44%
<span class="o">(</span><span class="s1">'충북'</span>, <span class="s1">'상당구'</span><span class="o">)</span> <span class="nt">-10</span>.16%
<span class="o">(</span><span class="s1">'경남'</span>, <span class="s1">'창원'</span><span class="o">)</span> <span class="nt">-10</span>.32%
<span class="o">(</span><span class="s1">'경남'</span>, <span class="s1">'의창구'</span><span class="o">)</span> <span class="nt">-11</span>.69%
<span class="o">(</span><span class="s1">'울산'</span>, <span class="s1">'동구'</span><span class="o">)</span> <span class="nt">-13</span>.13%
<span class="o">(</span><span class="s1">'경북'</span>, <span class="s1">'구미'</span><span class="o">)</span> <span class="nt">-14</span>.19%
<span class="o">(</span><span class="s1">'경남'</span>, <span class="s1">'성산구'</span><span class="o">)</span> <span class="nt">-16</span>.93%
</code></pre></div></div>]]></content><author><name>Jonghyun Ho</name></author><category term="Data" /><category term="Analysis" /><category term="Crawling" /><category term="Python" /><category term="Pandas" /><category term="DataFrame" /><category term="부동산" /><category term="아파트" /><category term="매매가격지수" /><summary type="html"><![CDATA[부동산에서 아파트의 가격이 얼마나 오르고 내렸는지를 확인하기 위해 주택매매가격 종합지수를 참고할 필요가 있다.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://jonghyunho.github.io/posts/20200501/housing-price-indices.png" /><media:content medium="image" url="https://jonghyunho.github.io/posts/20200501/housing-price-indices.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Backtrader 를 이용한 트레이딩 시뮬레이션</title><link href="https://jonghyunho.github.io/data/analysis/Backtrader-%EB%A5%BC-%EC%9D%B4%EC%9A%A9%ED%95%9C-%ED%8A%B8%EB%A0%88%EC%9D%B4%EB%94%A9-%EC%8B%9C%EB%AE%AC%EB%A0%88%EC%9D%B4%EC%85%98.html" rel="alternate" type="text/html" title="Backtrader 를 이용한 트레이딩 시뮬레이션" /><published>2020-04-23T00:00:00+09:00</published><updated>2020-04-23T00:00:00+09:00</updated><id>https://jonghyunho.github.io/data/analysis/Backtrader%20%EB%A5%BC%20%EC%9D%B4%EC%9A%A9%ED%95%9C%20%ED%8A%B8%EB%A0%88%EC%9D%B4%EB%94%A9%20%EC%8B%9C%EB%AE%AC%EB%A0%88%EC%9D%B4%EC%85%98</id><content type="html" xml:base="https://jonghyunho.github.io/data/analysis/Backtrader-%EB%A5%BC-%EC%9D%B4%EC%9A%A9%ED%95%9C-%ED%8A%B8%EB%A0%88%EC%9D%B4%EB%94%A9-%EC%8B%9C%EB%AE%AC%EB%A0%88%EC%9D%B4%EC%85%98.html"><![CDATA[<script id="MathJax-script" async="" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js">
</script>

<p>When trading, it would be useful to settle on an investment strategy, verify that it works as planned, and see what return it produces.</p>

<p>Let's simulate a strategy with <code class="language-plaintext highlighter-rouge">Backtrader</code>.</p>

<h2 id="backtrader">Backtrader</h2>

<p><a href="https://www.backtrader.com/">Backtrader</a> is a trading backtesting library for <code class="language-plaintext highlighter-rouge">Python</code>.</p>

<p>Another backtesting tool, <a href="https://www.zipline.io/index.html">Zipline</a>, also exists, but it does not support recent versions of <code class="language-plaintext highlighter-rouge">Python</code>, so <code class="language-plaintext highlighter-rouge">Backtrader</code> seems the better fit.</p>

<p>Reference: <a href="https://www.backtrader.com/">Backtrader</a></p>

<h2 id="backtrader-설치하기">Installing Backtrader</h2>

<p>The installation here uses <a href="https://www.anaconda.com/distribution/">Anaconda</a> on <code class="language-plaintext highlighter-rouge">Windows</code> with <code class="language-plaintext highlighter-rouge">Python 3.7</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> conda create <span class="nt">-n</span> stock <span class="nv">python</span><span class="o">=</span>3.7
<span class="o">&gt;</span> conda activate stock
</code></pre></div></div>

<p>Install <code class="language-plaintext highlighter-rouge">backtrader</code> along with the other required modules.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">(</span>stock<span class="o">)</span> <span class="o">&gt;</span> pip <span class="nb">install </span>backtrader requests matplotlib
</code></pre></div></div>

<h2 id="이동평균선을-활용한-전략은-유효한가">Does a moving-average strategy work?</h2>

<p>The earlier post <a href="https://jonghyunho.github.io/data/crawling/%EC%BD%94%EC%8A%A4%ED%94%BC-%EC%A7%80%EC%88%98-%EC%9D%B4%EB%8F%99%ED%8F%89%EA%B7%A0%EC%84%A0.html">KOSPI Index Moving Averages</a> discussed the <code class="language-plaintext highlighter-rouge">golden cross</code> and the <code class="language-plaintext highlighter-rouge">death cross</code>.</p>

<p>Let's use <code class="language-plaintext highlighter-rouge">Backtrader</code> to check how well the strategy <code class="language-plaintext highlighter-rouge">buy on the golden cross, sell on the death cross</code> actually holds up.</p>

<p>Import <code class="language-plaintext highlighter-rouge">backtrader</code> and the other required modules.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span>
<span class="kn">import</span> <span class="nn">backtrader</span> <span class="k">as</span> <span class="n">bt</span>
<span class="kn">import</span> <span class="nn">locale</span>

<span class="n">locale</span><span class="p">.</span><span class="n">setlocale</span><span class="p">(</span><span class="n">locale</span><span class="p">.</span><span class="n">LC_ALL</span><span class="p">,</span> <span class="s">'ko_KR'</span><span class="p">)</span>
</code></pre></div></div>

<p>Subclass <code class="language-plaintext highlighter-rouge">Strategy</code> from <code class="language-plaintext highlighter-rouge">backtrader</code> and implement the indicators and logic needed for the analysis. The indicators used here are the <code class="language-plaintext highlighter-rouge">5-day moving average</code> and the <code class="language-plaintext highlighter-rouge">30-day moving average</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Create a subclass of Strategy to define the indicators and logic
</span><span class="k">class</span> <span class="nc">SmaCross</span><span class="p">(</span><span class="n">bt</span><span class="p">.</span><span class="n">Strategy</span><span class="p">):</span>
    <span class="c1"># list of parameters which are configurable for the strategy
</span>    <span class="n">params</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span>
        <span class="n">pfast</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span>  <span class="c1"># period for the fast moving average
</span>        <span class="n">pslow</span><span class="o">=</span><span class="mi">30</span>  <span class="c1"># period for the slow moving average
</span>    <span class="p">)</span>
</code></pre></div></div>

<p>The class initializer builds a <code class="language-plaintext highlighter-rouge">CrossOver</code> signal from the two <code class="language-plaintext highlighter-rouge">moving averages</code>. A value greater than <code class="language-plaintext highlighter-rouge">0</code> means a <code class="language-plaintext highlighter-rouge">golden cross</code>, and a value less than <code class="language-plaintext highlighter-rouge">0</code> means a <code class="language-plaintext highlighter-rouge">death cross</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="n">sma1</span> <span class="o">=</span> <span class="n">bt</span><span class="p">.</span><span class="n">ind</span><span class="p">.</span><span class="n">SMA</span><span class="p">(</span><span class="n">period</span><span class="o">=</span><span class="bp">self</span><span class="p">.</span><span class="n">p</span><span class="p">.</span><span class="n">pfast</span><span class="p">)</span>  <span class="c1"># fast moving average
</span>        <span class="n">sma2</span> <span class="o">=</span> <span class="n">bt</span><span class="p">.</span><span class="n">ind</span><span class="p">.</span><span class="n">SMA</span><span class="p">(</span><span class="n">period</span><span class="o">=</span><span class="bp">self</span><span class="p">.</span><span class="n">p</span><span class="p">.</span><span class="n">pslow</span><span class="p">)</span>  <span class="c1"># slow moving average
</span>        <span class="bp">self</span><span class="p">.</span><span class="n">crossover</span> <span class="o">=</span> <span class="n">bt</span><span class="p">.</span><span class="n">ind</span><span class="p">.</span><span class="n">CrossOver</span><span class="p">(</span><span class="n">sma1</span><span class="p">,</span> <span class="n">sma2</span><span class="p">)</span>  <span class="c1"># crossover signal
</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">holding</span> <span class="o">=</span> <span class="mi">0</span>
</code></pre></div></div>
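
<p>To make the crossover signal concrete, here is a minimal plain-Python sketch of the idea. It is not Backtrader's implementation: the helper names <code class="language-plaintext highlighter-rouge">sma</code> and <code class="language-plaintext highlighter-rouge">crossover</code>, the short periods, and the sample prices are all assumptions made for illustration, and the handling of exact ties is simplified.</p>

```python
def sma(values, period):
    """Simple moving average; None until `period` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period:i + 1]
            out.append(sum(window) / period)
    return out


def crossover(fast, slow):
    """+1 when fast crosses above slow, -1 when it crosses below, else 0."""
    signal = [0] * len(fast)
    for i in range(1, len(fast)):
        if None in (fast[i - 1], slow[i - 1], fast[i], slow[i]):
            continue  # not enough history for both averages yet
        prev_diff = fast[i - 1] - slow[i - 1]
        diff = fast[i] - slow[i]
        if prev_diff <= 0 and diff > 0:
            signal[i] = 1    # golden cross
        elif prev_diff >= 0 and diff < 0:
            signal[i] = -1   # death cross
    return signal


# Hypothetical prices; periods 2 and 4 stand in for the post's 5 and 30.
prices = [10, 9, 8, 7, 8, 10, 12, 11, 9, 7]
signal = crossover(sma(prices, 2), sma(prices, 4))
print(signal)
```

<p>The signal is non-zero only on the bar where the crossing happens, which is why the strategy can act on it directly in <code class="language-plaintext highlighter-rouge">next</code>.</p>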

<p>The <code class="language-plaintext highlighter-rouge">next</code> method is called once per bar, in order, over the backtest period; this is where the strategy decides what to do.</p>

<p>Dividing the buyer's cash balance by the current share price gives the number of shares that can be bought; that number is stored in <code class="language-plaintext highlighter-rouge">available_stocks</code>.</p>

<p>Passing <code class="language-plaintext highlighter-rouge">available_stocks</code> to <code class="language-plaintext highlighter-rouge">buy</code> would spend all available cash, but this example buys and sells one share at a time.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">def</span> <span class="nf">next</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="n">current_stock_price</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">close</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>

        <span class="k">if</span> <span class="ow">not</span> <span class="bp">self</span><span class="p">.</span><span class="n">position</span><span class="p">:</span>  <span class="c1"># not in the market
</span>            <span class="k">if</span> <span class="bp">self</span><span class="p">.</span><span class="n">crossover</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">:</span>  <span class="c1"># if fast crosses slow to the upside
</span>                <span class="n">available_stocks</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">broker</span><span class="p">.</span><span class="n">getcash</span><span class="p">()</span> <span class="o">/</span> <span class="n">current_stock_price</span>
                <span class="bp">self</span><span class="p">.</span><span class="n">buy</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
</code></pre></div></div>
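
<p>Note that the division above yields a fractional number. If you did want to buy with all available cash, a rough sketch of rounding down to whole shares while leaving room for the commission might look like this. The helper <code class="language-plaintext highlighter-rouge">max_whole_shares</code> and the sample figures are hypothetical, and the sketch assumes a simple percentage commission on the purchase amount.</p>

```python
import math


def max_whole_shares(cash, price, commission_rate=0.0):
    """Largest whole number of shares whose cost plus commission fits in cash."""
    if price <= 0:
        raise ValueError("price must be positive")
    cost_per_share = price * (1.0 + commission_rate)
    return math.floor(cash / cost_per_share)


# Hypothetical figures: 100,000 won in cash, a 45,000 won share, 0.2% commission.
shares = max_whole_shares(cash=100000, price=45000, commission_rate=0.002)
print(shares)
```

<p>Inside the strategy, the result could then be passed as <code class="language-plaintext highlighter-rouge">size</code> to <code class="language-plaintext highlighter-rouge">buy</code> instead of the fixed <code class="language-plaintext highlighter-rouge">1</code>.</p>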

<p>On a <code class="language-plaintext highlighter-rouge">death cross</code>, <code class="language-plaintext highlighter-rouge">close</code> is called to sell the entire position. With <code class="language-plaintext highlighter-rouge">sell</code>, you can instead specify how many shares to sell.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        <span class="k">elif</span> <span class="bp">self</span><span class="p">.</span><span class="n">crossover</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">:</span>  <span class="c1"># in the market &amp; cross to the downside
</span>            <span class="bp">self</span><span class="p">.</span><span class="n">close</span><span class="p">()</span>  <span class="c1"># close long position
</span></code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">notify_order</code> is called whenever an order's status changes; the code below ignores everything except completed orders. For each completed order it prints a log line with the action (<code class="language-plaintext highlighter-rouge">buy</code> or <code class="language-plaintext highlighter-rouge">sell</code>), the <code class="language-plaintext highlighter-rouge">share price</code>, the <code class="language-plaintext highlighter-rouge">cash held</code>, the <code class="language-plaintext highlighter-rouge">portfolio value</code>, and the <code class="language-plaintext highlighter-rouge">number of shares held</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">def</span> <span class="nf">notify_order</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">order</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">order</span><span class="p">.</span><span class="n">status</span> <span class="ow">not</span> <span class="ow">in</span> <span class="p">[</span><span class="n">order</span><span class="p">.</span><span class="n">Completed</span><span class="p">]:</span>
            <span class="k">return</span>

        <span class="k">if</span> <span class="n">order</span><span class="p">.</span><span class="n">isbuy</span><span class="p">():</span>
            <span class="n">action</span> <span class="o">=</span> <span class="s">'Buy'</span>
        <span class="k">elif</span> <span class="n">order</span><span class="p">.</span><span class="n">issell</span><span class="p">():</span>
            <span class="n">action</span> <span class="o">=</span> <span class="s">'Sell'</span>

        <span class="n">stock_price</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">close</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
        <span class="n">cash</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">broker</span><span class="p">.</span><span class="n">getcash</span><span class="p">()</span>
        <span class="n">value</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">broker</span><span class="p">.</span><span class="n">getvalue</span><span class="p">()</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">holding</span> <span class="o">+=</span> <span class="n">order</span><span class="p">.</span><span class="n">size</span>

        <span class="k">print</span><span class="p">(</span><span class="s">'%s[%d] holding[%d] price[%d] cash[%.2f] value[%.2f]'</span>
              <span class="o">%</span> <span class="p">(</span><span class="n">action</span><span class="p">,</span> <span class="nb">abs</span><span class="p">(</span><span class="n">order</span><span class="p">.</span><span class="n">size</span><span class="p">),</span> <span class="bp">self</span><span class="p">.</span><span class="n">holding</span><span class="p">,</span> <span class="n">stock_price</span><span class="p">,</span> <span class="n">cash</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span>
</code></pre></div></div>

<p>Create the <code class="language-plaintext highlighter-rouge">Cerebro</code> engine and set the initial cash and the commission. A value of 0.002 means a 0.2% commission.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cerebro = bt.Cerebro()  # create a "Cerebro" engine instance
cerebro.broker.setcash(100000)
cerebro.broker.setcommission(0.002)
</code></pre></div></div>
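
<p>As a rough worked example of what a 0.2% commission means for a single round trip, the sketch below assumes the fee is simply price × size × rate on each side of the trade. The helper <code class="language-plaintext highlighter-rouge">commission_paid</code> and the prices are made up for illustration; it does not call Backtrader.</p>

```python
def commission_paid(price, size, rate=0.002):
    """Percentage commission on the traded amount (assumed fee model)."""
    return abs(size) * price * rate


# Hypothetical round trip: buy 1 share at 50,000 won, sell it at 52,000 won.
buy_fee = commission_paid(50000, 1)
sell_fee = commission_paid(52000, -1)
gross_profit = (52000 - 50000) * 1
net_profit = gross_profit - buy_fee - sell_fee

print('fees: %.2f, net profit: %.2f' % (buy_fee + sell_fee, net_profit))
```

<p>Under these assumptions the two fees add up to roughly 204 won, so about a tenth of the 2,000 won gross gain goes to commission.</p>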

<p>The data is the share price of <a href="https://finance.yahoo.com/quote/005930.KS">Samsung Electronics</a>, fetched from <code class="language-plaintext highlighter-rouge">Yahoo Finance</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Create a data feed
data = bt.feeds.YahooFinanceData(dataname='005930.KS',
                                 fromdate=datetime(2019, 1, 1),
                                 todate=datetime.now())

cerebro.adddata(data)  # Add the data feed

cerebro.addstrategy(SmaCross)  # Add the trading strategy
</code></pre></div></div>

<p>Run the simulation and check the result.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>start_value = cerebro.broker.getvalue()
cerebro.run()  # run it all
final_value = cerebro.broker.getvalue()

print('* start value : %s won' % locale.format_string('%d', start_value, grouping=True))
print('* final value : %s won' % locale.format_string('%d', final_value, grouping=True))
print('* earning rate : %.2f %%' % ((final_value - start_value) / start_value * 100.0))

cerebro.plot()  # and plot it with a single command
</code></pre></div></div>

<h2 id="backtrader-에-문제가-있다">A problem with Backtrader</h2>

<p>Running the script raises the following error; it does not work with the current version (as of 2020/4/23).</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="n">stock</span><span class="p">)</span> <span class="o">&gt;</span> <span class="n">python</span> <span class="n">sma</span><span class="p">.</span><span class="n">py</span>
<span class="n">Traceback</span> <span class="p">(</span><span class="n">most</span> <span class="n">recent</span> <span class="n">call</span> <span class="n">last</span><span class="p">):</span>
  <span class="n">File</span> <span class="s">"sma.py"</span><span class="p">,</span> <span class="n">line</span> <span class="mi">64</span><span class="p">,</span> <span class="ow">in</span> <span class="o">&lt;</span><span class="n">module</span><span class="o">&gt;</span>
    <span class="n">cerebro</span><span class="p">.</span><span class="n">run</span><span class="p">()</span>  <span class="c1"># run it all
</span>  <span class="n">File</span> <span class="s">"C:\Anaconda3\envs\stock\lib\site-packages</span><span class="se">\b</span><span class="s">acktrader\cerebro.py"</span><span class="p">,</span> <span class="n">line</span> <span class="mi">1127</span><span class="p">,</span> <span class="ow">in</span> <span class="n">run</span>
    <span class="n">runstrat</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">runstrategies</span><span class="p">(</span><span class="n">iterstrat</span><span class="p">)</span>
  <span class="n">File</span> <span class="s">"C:\Anaconda3\envs\stock\lib\site-packages</span><span class="se">\b</span><span class="s">acktrader\cerebro.py"</span><span class="p">,</span> <span class="n">line</span> <span class="mi">1210</span><span class="p">,</span> <span class="ow">in</span> <span class="n">runstrategies</span>
    <span class="n">data</span><span class="p">.</span><span class="n">_start</span><span class="p">()</span>
  <span class="n">File</span> <span class="s">"C:\Anaconda3\envs\stock\lib\site-packages</span><span class="se">\b</span><span class="s">acktrader</span><span class="se">\f</span><span class="s">eed.py"</span><span class="p">,</span> <span class="n">line</span> <span class="mi">203</span><span class="p">,</span> <span class="ow">in</span> <span class="n">_start</span>
    <span class="bp">self</span><span class="p">.</span><span class="n">start</span><span class="p">()</span>
  <span class="n">File</span> <span class="s">"C:\Anaconda3\envs\stock\lib\site-packages</span><span class="se">\b</span><span class="s">acktrader</span><span class="se">\f</span><span class="s">eeds\yahoo.py"</span><span class="p">,</span> <span class="n">line</span> <span class="mi">352</span><span class="p">,</span> <span class="ow">in</span> <span class="n">start</span>
    <span class="nb">super</span><span class="p">(</span><span class="n">YahooFinanceData</span><span class="p">,</span> <span class="bp">self</span><span class="p">).</span><span class="n">start</span><span class="p">()</span>
  <span class="n">File</span> <span class="s">"C:\Anaconda3\envs\stock\lib\site-packages</span><span class="se">\b</span><span class="s">acktrader</span><span class="se">\f</span><span class="s">eeds\yahoo.py"</span><span class="p">,</span> <span class="n">line</span> <span class="mi">94</span><span class="p">,</span> <span class="ow">in</span> <span class="n">start</span>
    <span class="nb">super</span><span class="p">(</span><span class="n">YahooFinanceCSVData</span><span class="p">,</span> <span class="bp">self</span><span class="p">).</span><span class="n">start</span><span class="p">()</span>
  <span class="n">File</span> <span class="s">"C:\Anaconda3\envs\stock\lib\site-packages</span><span class="se">\b</span><span class="s">acktrader</span><span class="se">\f</span><span class="s">eed.py"</span><span class="p">,</span> <span class="n">line</span> <span class="mi">674</span><span class="p">,</span> <span class="ow">in</span> <span class="n">start</span>
    <span class="bp">self</span><span class="p">.</span><span class="n">f</span> <span class="o">=</span> <span class="n">io</span><span class="p">.</span><span class="nb">open</span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="n">p</span><span class="p">.</span><span class="n">dataname</span><span class="p">,</span> <span class="s">'r'</span><span class="p">)</span>
<span class="nb">FileNotFoundError</span><span class="p">:</span> <span class="p">[</span><span class="n">Errno</span> <span class="mi">2</span><span class="p">]</span> <span class="n">No</span> <span class="n">such</span> <span class="nb">file</span> <span class="ow">or</span> <span class="n">directory</span><span class="p">:</span> <span class="s">'005930.KS'</span>
</code></pre></div></div>

<p>According to a post on the <a href="https://community.backtrader.com/topic/2363/errno-2-no-such-file-or-directory">Backtrader Community</a>, the error is caused by a change in the response from the <code class="language-plaintext highlighter-rouge">yahoo</code> API, and the <code class="language-plaintext highlighter-rouge">yahoo.py</code> file needs to be modified.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>            <span class="n">ctype</span> <span class="o">=</span> <span class="n">resp</span><span class="p">.</span><span class="n">headers</span><span class="p">[</span><span class="s">'Content-Type'</span><span class="p">]</span>
            <span class="k">if</span> <span class="s">'text/csv'</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">ctype</span><span class="p">:</span>
                <span class="bp">self</span><span class="p">.</span><span class="n">error</span> <span class="o">=</span> <span class="s">'Wrong content type: %s'</span> <span class="o">%</span> <span class="n">ctype</span>
                <span class="k">continue</span>  <span class="c1"># HTML returned? wrong url?
</span></code></pre></div></div>

<p>The code above needs to be changed as follows.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">diff</span> <span class="o">--</span><span class="n">git</span> <span class="n">a</span><span class="o">/</span><span class="n">backtrader</span><span class="o">/</span><span class="n">feeds</span><span class="o">/</span><span class="n">yahoo</span><span class="p">.</span><span class="n">py</span> <span class="n">b</span><span class="o">/</span><span class="n">backtrader</span><span class="o">/</span><span class="n">feeds</span><span class="o">/</span><span class="n">yahoo</span><span class="p">.</span><span class="n">py</span>
<span class="n">index</span> <span class="n">abfe97d</span><span class="p">..</span><span class="n">bd1f6ea</span> <span class="mi">100644</span>
<span class="o">---</span> <span class="n">a</span><span class="o">/</span><span class="n">backtrader</span><span class="o">/</span><span class="n">feeds</span><span class="o">/</span><span class="n">yahoo</span><span class="p">.</span><span class="n">py</span>
<span class="o">+++</span> <span class="n">b</span><span class="o">/</span><span class="n">backtrader</span><span class="o">/</span><span class="n">feeds</span><span class="o">/</span><span class="n">yahoo</span><span class="p">.</span><span class="n">py</span>
<span class="o">@@</span> <span class="o">-</span><span class="mi">330</span><span class="p">,</span><span class="mi">7</span> <span class="o">+</span><span class="mi">330</span><span class="p">,</span><span class="mi">7</span> <span class="o">@@</span> <span class="k">class</span> <span class="nc">YahooFinanceData</span><span class="p">(</span><span class="n">YahooFinanceCSVData</span><span class="p">):</span>
                 <span class="k">continue</span>

             <span class="n">ctype</span> <span class="o">=</span> <span class="n">resp</span><span class="p">.</span><span class="n">headers</span><span class="p">[</span><span class="s">'Content-Type'</span><span class="p">]</span>
<span class="o">-</span>            <span class="k">if</span> <span class="s">'text/csv'</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">ctype</span><span class="p">:</span>
<span class="o">+</span>            <span class="k">if</span> <span class="n">ctype</span> <span class="ow">not</span> <span class="ow">in</span> <span class="p">[</span><span class="s">'text/csv'</span><span class="p">,</span> <span class="s">'text/plain'</span><span class="p">]:</span>
                 <span class="bp">self</span><span class="p">.</span><span class="n">error</span> <span class="o">=</span> <span class="s">'Wrong content type: %s'</span> <span class="o">%</span> <span class="n">ctype</span>
                 <span class="k">continue</span>  <span class="c1"># HTML returned? wrong url?
</span></code></pre></div></div>
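
<p>One thing to keep in mind about the patched line: <code class="language-plaintext highlighter-rouge">ctype not in [...]</code> is an exact match against the list, so a header value that carries extra parameters would be rejected. Whether the server ever sends such a value is an assumption that has not been checked here. The sketch below shows a more tolerant comparison; <code class="language-plaintext highlighter-rouge">is_csv_like</code> is a hypothetical helper and not part of backtrader.</p>

```python
def is_csv_like(content_type):
    """Return True when the media type is text/csv or text/plain.

    Hypothetical helper, not part of backtrader. It drops parameters
    such as "; charset=utf-8" and ignores letter case before comparing.
    """
    media_type = content_type.split(';', 1)[0].strip().lower()
    return media_type in ('text/csv', 'text/plain')


# Plain values and values with parameters are both accepted.
print(is_csv_like('text/csv'))
print(is_csv_like('text/plain; charset=utf-8'))
# An HTML error page is still rejected.
print(is_csv_like('text/html'))
```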

<h2 id="전체-코드">Full code</h2>

<p>The full code for the snippets above is as follows.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span>
<span class="kn">import</span> <span class="nn">backtrader</span> <span class="k">as</span> <span class="n">bt</span>
<span class="kn">import</span> <span class="nn">locale</span>

<span class="n">locale</span><span class="p">.</span><span class="n">setlocale</span><span class="p">(</span><span class="n">locale</span><span class="p">.</span><span class="n">LC_ALL</span><span class="p">,</span> <span class="s">'ko_KR'</span><span class="p">)</span>

<span class="c1"># Create a subclass of Strategy to define the indicators and logic
</span><span class="k">class</span> <span class="nc">SmaCross</span><span class="p">(</span><span class="n">bt</span><span class="p">.</span><span class="n">Strategy</span><span class="p">):</span>
    <span class="c1"># list of parameters which are configurable for the strategy
</span>    <span class="n">params</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span>
        <span class="n">pfast</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span>  <span class="c1"># period for the fast moving average
</span>        <span class="n">pslow</span><span class="o">=</span><span class="mi">30</span>  <span class="c1"># period for the slow moving average
</span>    <span class="p">)</span>

    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="n">sma1</span> <span class="o">=</span> <span class="n">bt</span><span class="p">.</span><span class="n">ind</span><span class="p">.</span><span class="n">SMA</span><span class="p">(</span><span class="n">period</span><span class="o">=</span><span class="bp">self</span><span class="p">.</span><span class="n">p</span><span class="p">.</span><span class="n">pfast</span><span class="p">)</span>  <span class="c1"># fast moving average
</span>        <span class="n">sma2</span> <span class="o">=</span> <span class="n">bt</span><span class="p">.</span><span class="n">ind</span><span class="p">.</span><span class="n">SMA</span><span class="p">(</span><span class="n">period</span><span class="o">=</span><span class="bp">self</span><span class="p">.</span><span class="n">p</span><span class="p">.</span><span class="n">pslow</span><span class="p">)</span>  <span class="c1"># slow moving average
</span>        <span class="bp">self</span><span class="p">.</span><span class="n">crossover</span> <span class="o">=</span> <span class="n">bt</span><span class="p">.</span><span class="n">ind</span><span class="p">.</span><span class="n">CrossOver</span><span class="p">(</span><span class="n">sma1</span><span class="p">,</span> <span class="n">sma2</span><span class="p">)</span>  <span class="c1"># crossover signal
</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">holding</span> <span class="o">=</span> <span class="mi">0</span>

    <span class="k">def</span> <span class="nf">next</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="bp">self</span><span class="p">.</span><span class="n">position</span><span class="p">:</span>  <span class="c1"># not in the market
</span>            <span class="k">if</span> <span class="bp">self</span><span class="p">.</span><span class="n">crossover</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">:</span>  <span class="c1"># if fast crosses slow to the upside
</span>                <span class="c1"># buy one share per signal; the order size is fixed, not derived from cash
</span>                <span class="bp">self</span><span class="p">.</span><span class="n">buy</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>

        <span class="k">elif</span> <span class="bp">self</span><span class="p">.</span><span class="n">crossover</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">:</span>  <span class="c1"># in the market &amp; cross to the downside
</span>            <span class="bp">self</span><span class="p">.</span><span class="n">close</span><span class="p">()</span>  <span class="c1"># close long position
</span>
    <span class="k">def</span> <span class="nf">notify_order</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">order</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">order</span><span class="p">.</span><span class="n">status</span> <span class="ow">not</span> <span class="ow">in</span> <span class="p">[</span><span class="n">order</span><span class="p">.</span><span class="n">Completed</span><span class="p">]:</span>
            <span class="k">return</span>

        <span class="k">if</span> <span class="n">order</span><span class="p">.</span><span class="n">isbuy</span><span class="p">():</span>
            <span class="n">action</span> <span class="o">=</span> <span class="s">'Buy'</span>
        <span class="k">elif</span> <span class="n">order</span><span class="p">.</span><span class="n">issell</span><span class="p">():</span>
            <span class="n">action</span> <span class="o">=</span> <span class="s">'Sell'</span>

        <span class="n">stock_price</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">close</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
        <span class="n">cash</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">broker</span><span class="p">.</span><span class="n">getcash</span><span class="p">()</span>
        <span class="n">value</span> <span class="o">=</span> <span class="bp">self</span><span class="p">.</span><span class="n">broker</span><span class="p">.</span><span class="n">getvalue</span><span class="p">()</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">holding</span> <span class="o">+=</span> <span class="n">order</span><span class="p">.</span><span class="n">size</span>

        <span class="k">print</span><span class="p">(</span><span class="s">'%s[%d] holding[%d] price[%d] cash[%.2f] value[%.2f]'</span>
              <span class="o">%</span> <span class="p">(</span><span class="n">action</span><span class="p">,</span> <span class="nb">abs</span><span class="p">(</span><span class="n">order</span><span class="p">.</span><span class="n">size</span><span class="p">),</span> <span class="bp">self</span><span class="p">.</span><span class="n">holding</span><span class="p">,</span> <span class="n">stock_price</span><span class="p">,</span> <span class="n">cash</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span>

<span class="n">cerebro</span> <span class="o">=</span> <span class="n">bt</span><span class="p">.</span><span class="n">Cerebro</span><span class="p">()</span>  <span class="c1"># create a "Cerebro" engine instance
</span><span class="n">cerebro</span><span class="p">.</span><span class="n">broker</span><span class="p">.</span><span class="n">setcash</span><span class="p">(</span><span class="mi">100000</span><span class="p">)</span>
<span class="n">cerebro</span><span class="p">.</span><span class="n">broker</span><span class="p">.</span><span class="n">setcommission</span><span class="p">(</span><span class="mf">0.002</span><span class="p">)</span>

<span class="c1"># Create a data feed
</span><span class="n">data</span> <span class="o">=</span> <span class="n">bt</span><span class="p">.</span><span class="n">feeds</span><span class="p">.</span><span class="n">YahooFinanceData</span><span class="p">(</span><span class="n">dataname</span><span class="o">=</span><span class="s">'005930.KS'</span><span class="p">,</span>
                                 <span class="n">fromdate</span><span class="o">=</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2019</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span>
                                 <span class="n">todate</span><span class="o">=</span><span class="n">datetime</span><span class="p">.</span><span class="n">now</span><span class="p">())</span>

<span class="n">cerebro</span><span class="p">.</span><span class="n">adddata</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>  <span class="c1"># Add the data feed
</span>
<span class="n">cerebro</span><span class="p">.</span><span class="n">addstrategy</span><span class="p">(</span><span class="n">SmaCross</span><span class="p">)</span>  <span class="c1"># Add the trading strategy
</span>
<span class="n">start_value</span> <span class="o">=</span> <span class="n">cerebro</span><span class="p">.</span><span class="n">broker</span><span class="p">.</span><span class="n">getvalue</span><span class="p">()</span>
<span class="n">cerebro</span><span class="p">.</span><span class="n">run</span><span class="p">()</span>  <span class="c1"># run it all
</span><span class="n">final_value</span> <span class="o">=</span> <span class="n">cerebro</span><span class="p">.</span><span class="n">broker</span><span class="p">.</span><span class="n">getvalue</span><span class="p">()</span>

<span class="k">print</span><span class="p">(</span><span class="s">'* start value : %s won'</span> <span class="o">%</span> <span class="n">locale</span><span class="p">.</span><span class="n">format_string</span><span class="p">(</span><span class="s">'%d'</span><span class="p">,</span> <span class="n">start_value</span><span class="p">,</span> <span class="n">grouping</span><span class="o">=</span><span class="bp">True</span><span class="p">))</span>
<span class="k">print</span><span class="p">(</span><span class="s">'* final value : %s won'</span> <span class="o">%</span> <span class="n">locale</span><span class="p">.</span><span class="n">format_string</span><span class="p">(</span><span class="s">'%d'</span><span class="p">,</span> <span class="n">final_value</span><span class="p">,</span> <span class="n">grouping</span><span class="o">=</span><span class="bp">True</span><span class="p">))</span>
<span class="k">print</span><span class="p">(</span><span class="s">'* earning rate : %.2f %%'</span> <span class="o">%</span> <span class="p">((</span><span class="n">final_value</span> <span class="o">-</span> <span class="n">start_value</span><span class="p">)</span> <span class="o">/</span> <span class="n">start_value</span> <span class="o">*</span> <span class="mf">100.0</span><span class="p">))</span>

<span class="n">cerebro</span><span class="p">.</span><span class="n">plot</span><span class="p">()</span>  <span class="c1"># and plot it with a single command
</span></code></pre></div></div>
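
<p>The strategy above always buys a single share, so part of the starting cash stays idle. As a rough sketch of how an order size could be derived from the available cash instead, the helper below computes the largest whole number of shares that fits, commission included. <code class="language-plaintext highlighter-rouge">max_affordable_shares</code> is a hypothetical name, and it assumes the commission is a fraction of the traded value (0.002 meaning 0.2%).</p>

```python
def max_affordable_shares(cash, price, commission=0.002):
    """Largest whole number of shares whose cost plus commission fits in cash.

    Hypothetical helper for illustration only. Assumes the commission is
    charged as a fraction of the traded value (0.002 means 0.2%).
    """
    if not price > 0:
        return 0  # no meaningful size for a zero or negative price
    cost_per_share = price * (1 + commission)
    return int(cash // cost_per_share)


# 100,000 won of cash at a price of 45,350 won per share
print(max_affordable_shares(100000, 45350))
```

<p>Inside <code class="language-plaintext highlighter-rouge">next()</code> the result could be passed as <code class="language-plaintext highlighter-rouge">self.buy(size=...)</code> using <code class="language-plaintext highlighter-rouge">self.broker.getcash()</code> and <code class="language-plaintext highlighter-rouge">self.data.close[0]</code>. Because the order may fill at a different price than the close used for sizing, leaving a small cash buffer would be safer than spending every won.</p>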

<h2 id="시뮬레이션-결과-확인">Checking the simulation results</h2>

<p>I ran the simulation with a starting cash of 100,000 won. Even after seven round trips of buying and selling, the return was only 0.25%.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">(</span>stock<span class="o">)</span> <span class="o">&gt;</span> python sma.py
Buy[1] holding[1] price[45350] cash[55160.50] value[100510.50]
Sell[1] holding[0] price[45050] cash[100270.10] value[100270.10]
Buy[1] holding[1] price[46950] cash[54027.80] value[100977.80]
Sell[1] holding[0] price[44650] cash[98189.30] value[98189.30]
Buy[1] holding[1] price[44800] cash[53800.70] value[98600.70]
Sell[1] holding[0] price[44950] cash[98261.60] value[98261.60]
Buy[1] holding[1] price[46900] cash[51718.70] value[98618.70]
Sell[1] holding[0] price[51300] cash[103514.90] value[103514.90]
Buy[1] holding[1] price[50300] cash[52212.50] value[102512.50]
Sell[1] holding[0] price[50400] cash[103010.70] value[103010.70]
Buy[1] holding[1] price[54700] cash[48401.70] value[103101.70]
Sell[1] holding[0] price[58900] cash[105387.50] value[105387.50]
Buy[1] holding[1] price[60400] cash[44165.30] value[104565.30]
Sell[1] holding[0] price[57900] cash[100252.90] value[100252.90]
<span class="k">*</span> start value : 100,000 won
<span class="k">*</span> final value : 100,252 won
<span class="k">*</span> earning rate : 0.25 %
</code></pre></div></div>
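
<p>The <code class="language-plaintext highlighter-rouge">price</code> column in this log is the closing price of the bar on which the order was reported, not the price at which it was filled. As a rough check, the fill prices and commissions can be inferred from the <code class="language-plaintext highlighter-rouge">cash</code> column alone. The sketch below assumes that every order is for one share and that the 0.002 commission is charged as 0.2% of the traded value on each side; <code class="language-plaintext highlighter-rouge">infer_fills</code> is a hypothetical helper, not a backtrader API.</p>

```python
COMMISSION = 0.002    # same value as passed to setcommission()
START_CASH = 100000.0

# cash column copied from the log above, in order
CASH_AFTER = [
    55160.50, 100270.10, 54027.80, 98189.30, 53800.70, 98261.60, 51718.70,
    103514.90, 52212.50, 103010.70, 48401.70, 105387.50, 44165.30, 100252.90,
]


def infer_fills(start_cash, cash_after, commission):
    """Infer (side, fill price, fee) per order from successive cash balances.

    Assumes one share per order and a percentage commission on both sides.
    """
    fills = []
    previous = start_cash
    for cash in cash_after:
        if previous > cash:
            # cash went down: a buy that cost price * (1 + commission)
            side = 'Buy'
            price = (previous - cash) / (1 + commission)
        else:
            # cash went up: a sell that returned price * (1 - commission)
            side = 'Sell'
            price = (cash - previous) / (1 - commission)
        fills.append((side, round(price, 2), round(price * commission, 2)))
        previous = cash
    return fills


fills = infer_fills(START_CASH, CASH_AFTER, COMMISSION)
total_fees = sum(fee for _, _, fee in fills)
net_gain = CASH_AFTER[-1] - START_CASH

for side, price, fee in fills:
    print('%-4s fill price %.2f, fee %.2f' % (side, price, fee))
print('total fees: %.2f won' % total_fees)
print('net gain:   %.2f won' % net_gain)
```

<p>Under those assumptions the commissions add up to roughly 1,397 won, while the net gain is roughly 253 won, so most of what the trades earned before costs went to fees. To print the real fill price directly, <code class="language-plaintext highlighter-rouge">order.executed.price</code> could be logged in <code class="language-plaintext highlighter-rouge">notify_order</code> instead of the closing price.</p>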

<p>The graph below shows the trades more clearly.</p>

<p><img src="/assets/img/posts/20200423/backtrader_sma_simulation.png" alt="Backtrader SMA Test" /></p>

<p>The <code class="language-plaintext highlighter-rouge">moving-average</code> strategy surely works well in some market conditions, but <code class="language-plaintext highlighter-rouge">Backtrader</code> showed that it was not very effective for 005930.KS over the period tested here. It should be fun to design and try out a variety of strategies with <code class="language-plaintext highlighter-rouge">Backtrader</code>. And since it offers so many useful features, I hope the <code class="language-plaintext highlighter-rouge">backtrader</code> open-source project stays well maintained.</p>]]></content><author><name>Jonghyun Ho</name></author><category term="Data" /><category term="Analysis" /><category term="Stock" /><category term="KOSPI" /><category term="Crawling" /><category term="Python" /><category term="Backtrader" /><category term="Simulation" /><summary type="html"><![CDATA[]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://jonghyunho.github.io/posts/20200423/backtrader_sma_simulation.png" /><media:content medium="image" url="https://jonghyunho.github.io/posts/20200423/backtrader_sma_simulation.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>