<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Arquivo de Aprendizagem de máquina - Ramon Domingos Blog</title>
	<atom:link href="https://ramondomingos.com.br/category/aprendizagem-de-maquina/feed/" rel="self" type="application/rss+xml" />
	<link>https://ramondomingos.com.br/category/aprendizagem-de-maquina/</link>
	<description>Conteúdo sobre tecnologia e engenharia de software.</description>
	<lastBuildDate>Tue, 17 Oct 2023 14:32:56 +0000</lastBuildDate>
	<language>pt-BR</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://ramondomingos.com.br/wp-content/uploads/2023/09/cropped-Logotipo_bold_minimalista_amarelo_para_blog__1_-removebg-preview-32x32.png</url>
	<title>Arquivo de Aprendizagem de máquina - Ramon Domingos Blog</title>
	<link>https://ramondomingos.com.br/category/aprendizagem-de-maquina/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Aplicando Machine Learning no dataset sobre Doenças cardíacas</title>
		<link>https://ramondomingos.com.br/aplicando-machine-learning-no-dataset-sobre-doencas-cardiacas/</link>
					<comments>https://ramondomingos.com.br/aplicando-machine-learning-no-dataset-sobre-doencas-cardiacas/#respond</comments>
		
		<dc:creator><![CDATA[Ramon Domingos]]></dc:creator>
		<pubDate>Mon, 16 Oct 2023 17:13:38 +0000</pubDate>
				<category><![CDATA[Aprendizagem de máquina]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Scikit-Learn]]></category>
		<guid isPermaLink="false">https://ramondomingos.com.br/?p=191</guid>

					<description><![CDATA[<p>Myocardial infarction, or heart attack, is the death of cells in a region of the heart muscle caused by the formation of a clot that suddenly and severely interrupts blood flow. Source: ALVES, B. / O. / O.-M. Ataque cardíaco (infarto) &#124; Biblioteca Virtual em Saúde MS. Available at:&#160;https://bvsms.saude.gov.br/ataque-cardiaco-infarto/#:~:text=O%20infarto%20do%20mioc%C3%A1rdio%2C%20ou. Predicting&#8230;</p>
<p>The post <a href="https://ramondomingos.com.br/aplicando-machine-learning-no-dataset-sobre-doencas-cardiacas/">Aplicando Machine Learning no dataset sobre Doenças cardíacas</a> first appeared on <a href="https://ramondomingos.com.br">Ramon Domingos Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow">
<p>Myocardial infarction, or heart attack, is the death of cells in a region of the heart muscle caused by the formation of a clot that suddenly and severely interrupts blood flow.</p>



<p>Source: ALVES, B. / O. / O.-M. Ataque cardíaco (infarto) | Biblioteca Virtual em Saúde MS. Available at:&nbsp;<a href="https://bvsms.saude.gov.br/ataque-cardiaco-infarto/#:~:text=O%20infarto%20do%20mioc%C3%A1rdio%2C%20ou" target="_blank" rel="noreferrer noopener">https://bvsms.saude.gov.br/ataque-cardiaco-infarto/#:~:text=O%20infarto%20do%20mioc%C3%A1rdio%2C%20ou</a>.</p>
</blockquote>



<p>Predicting possible heart disease from a patient's history helps that person seek care before symptoms appear or before the illness leaves lasting damage. Analyzing health data is delicate work: patients must not be exposed in any way, and a specialist is sometimes needed to interpret the data properly.</p>



<p>As usual, the examples from this post are available on <a href="https://drive.google.com/file/d/1sBJ6w-Sege6ryUUm2swsyifOUn5BiFVM/view?usp=sharing">Colab</a>.</p>



<p>In this post we train the following algorithms: <strong>Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbor (KNN), Decision Tree (DT)</strong>. Some of them were run with different parameters to find a configuration with good accuracy.</p>



<h2 class="wp-block-heading">About the dataset</h2>



<p>The dataset used here is available at https://www.kaggle.com/datasets/johnsmith88/heart-disease-dataset and contains the following columns:</p>



<figure class="wp-block-table"><table><tbody><tr><td><strong>Column</strong></td><td><strong>Description</strong></td><td><strong>Values</strong></td></tr><tr><td>Age</td><td>Age.</td><td>22 to 77 years</td></tr><tr><td>Sex</td><td>Sex.</td><td>1: male<br>0: female</td></tr><tr><td>cp</td><td>Chest pain type.</td><td>1 to 4</td></tr><tr><td>trestbps</td><td>Blood pressure in mm Hg on admission to the hospital.</td><td>94 to 200</td></tr><tr><td>chol</td><td>Cholesterol in mg/dl.</td><td>126 to 564</td></tr><tr><td>fbs</td><td>Fasting blood sugar above 120 mg/dl.</td><td>1: true<br>0: false</td></tr><tr><td>restecg</td><td>Resting electrocardiographic results.</td><td>0 to 2</td></tr><tr><td>thalach</td><td>Maximum heart rate achieved.</td><td>71 to 202</td></tr><tr><td>exang</td><td>Exercise-induced angina.</td><td>1: yes<br>0: no</td></tr><tr><td>oldpeak</td><td>ST depression induced by exercise relative to rest.</td><td>0 to 6.2</td></tr><tr><td>slope</td><td>Slope of the peak exercise ST segment.</td><td>1 to 3</td></tr><tr><td>ca</td><td>Number of major vessels colored by fluoroscopy.</td><td>0 to 3</td></tr><tr><td>thal</td><td>Thallium stress test result.</td><td>1: normal<br>2: fixed defect<br>3: reversible defect</td></tr><tr><td>target</td><td>Whether the patient has heart disease.</td><td>1: yes<br>0: no</td></tr></tbody></table></figure>



<h2 class="wp-block-heading">Preprocessing</h2>



<p><strong>Removing duplicates</strong></p>



<p>The dataset has 1025 instances. After running the <em>profile-report</em> library, several repeated instances were identified. Repeated instances can bias the model, since it may simply reproduce a row it has already seen instead of predicting. They were removed with the pandas function <em>drop_duplicates()</em>.</p>
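<p>The duplicate check and removal can be sketched as follows. This is a minimal illustration, assuming a tiny hand-made DataFrame in place of the real dataset; the column values and the variable names <em>n_duplicates</em> and <em>df_clean</em> are hypothetical.</p>

```python
import pandas as pd

# Tiny hypothetical frame standing in for the heart-disease data;
# rows 0 and 2 are identical on purpose.
df = pd.DataFrame({
    "age": [52, 53, 52, 61],
    "chol": [212, 203, 212, 203],
    "target": [0, 0, 0, 1],
})

# duplicated() flags every row that repeats an earlier row.
n_duplicates = int(df.duplicated().sum())

# drop_duplicates() keeps the first occurrence of each repeated row.
df_clean = df.drop_duplicates().reset_index(drop=True)

print(f"duplicated rows: {n_duplicates}")
print(f"shape before: {df.shape}, after: {df_clean.shape}")
```

<p>Counting the duplicates before dropping them makes it easy to confirm how many rows were actually removed.</p>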



<p><strong>Removing outliers</strong></p>



<p>A box plot shows that the data contains outliers, which were removed using the interquartile range (IQR). This technique was covered in another post; see <a href="https://ramondomingos.com.br/removendo-outliers-de-uma-base-de-dados/">here</a>.</p>



<figure class="wp-block-image size-large"><img fetchpriority="high" decoding="async" width="1024" height="629" src="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-1024x629.png" alt="" class="wp-image-192" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-1024x629.png 1024w, https://ramondomingos.com.br/wp-content/uploads/2023/10/image-300x184.png 300w, https://ramondomingos.com.br/wp-content/uploads/2023/10/image-768x472.png 768w, https://ramondomingos.com.br/wp-content/uploads/2023/10/image-1536x944.png 1536w, https://ramondomingos.com.br/wp-content/uploads/2023/10/image.png 1572w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>
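<p>An IQR filter of this kind can be sketched as below. This is only an illustration under assumptions: the helper <em>remove_outliers_iqr</em>, the conventional multiplier 1.5, and the sample cholesterol values are hypothetical and are not taken from the notebook.</p>

```python
import pandas as pd


def remove_outliers_iqr(frame, columns, k=1.5):
    """Keep rows whose values lie within [Q1 - k*IQR, Q3 + k*IQR] for every listed column."""
    mask = pd.Series(True, index=frame.index)
    for col in columns:
        q1 = frame[col].quantile(0.25)
        q3 = frame[col].quantile(0.75)
        iqr = q3 - q1
        # between() is inclusive on both ends by default.
        mask &= frame[col].between(q1 - k * iqr, q3 + k * iqr)
    return frame[mask]


# Hypothetical cholesterol values; 564 sits far above the rest.
df = pd.DataFrame({"chol": [200, 210, 220, 230, 240, 250, 564]})
df_no_outliers = remove_outliers_iqr(df, ["chol"])
print(df_no_outliers["chol"].tolist())
```

<p>Here Q1 is 215 and Q3 is 245, so the accepted range is 170 to 290 and only the value 564 is dropped.</p>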



<h2 class="wp-block-heading">Training the models</h2>



<p><strong>Test set:</strong></p>



<p>It is very important to split the data into training and test sets, so that no row used for training also appears in the test set. scikit-learn has a function that does this:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="y = df[&quot;target&quot;]
X = df.drop('target',axis=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state = 0)" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF">y </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> df</span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">target</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">]</span></span>
<span class="line"><span style="color: #D8DEE9FF">X </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> df</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">drop</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">target</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9">axis</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">1</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">X_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> X_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_test </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">train_test_split</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">test_size</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">0.20</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">random_state</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">0</span><span style="color: #ECEFF4">)</span></span></code></pre></div>



<p><strong>Decision Tree:</strong></p>



<p>This algorithm was already covered in another post (see <a href="https://ramondomingos.com.br/aplicando-arvore-de-decisao-no-dataset-iris/">here</a>). Basically, each fork, called a node, is a decision; decisions are made one after another until a leaf is reached, which holds the final prediction.</p>
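<p>For reference, a single tree can be trained with scikit-learn's <em>DecisionTreeClassifier</em>. The sketch below is an illustration under assumptions, not the post's own code: it uses synthetic data from <em>make_classification</em> in place of the heart-disease DataFrame, and the <em>criterion</em>, <em>max_depth</em> and <em>random_state</em> values are illustrative rather than tuned.</p>

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 13 feature columns and the binary target.
X, y = make_classification(n_samples=300, n_features=13, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0
)

# max_depth=5 is an illustrative choice, not a tuned value.
dt = DecisionTreeClassifier(criterion="entropy", max_depth=5, random_state=0)
dt.fit(X_train, y_train)
dt_predicted = dt.predict(X_test)

dt_conf_matrix = confusion_matrix(y_test, dt_predicted)
dt_acc_score = accuracy_score(y_test, dt_predicted)
print("confusion matrix")
print(dt_conf_matrix)
print("Accuracy of Decision Tree:", dt_acc_score * 100, "%")
```

<p>Because the data here is synthetic, the printed accuracy says nothing about the heart-disease dataset; only the sequence of calls carries over.</p>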



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="rf = RandomForestClassifier(n_estimators=20, random_state=12,max_depth=5)
rf.fit(X_train,y_train)
rf_predicted = rf.predict(X_test)
rf_conf_matrix = confusion_matrix(y_test, rf_predicted)
rf_acc_score = accuracy_score(y_test, rf_predicted)
print(&quot;confussion matrix&quot;)
print(rf_conf_matrix)
print(&quot;\n&quot;)
print(&quot;Accuracy of Random Forest:&quot;,rf_acc_score*100,'%\n')
print(classification_report(y_test,rf_predicted))" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF">rf </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">RandomForestClassifier</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">n_estimators</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">20</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">random_state</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">12</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9">max_depth</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">5</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">rf</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">fit</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">y_train</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">rf_predicted </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> rf</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">predict</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_test</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">rf_conf_matrix </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">confusion_matrix</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> rf_predicted</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">rf_acc_score </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">accuracy_score</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> rf_predicted</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">confussion matrix</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">rf_conf_matrix</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #EBCB8B">\n</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Accuracy of Random Forest:</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">rf_acc_score</span><span style="color: #81A1C1">*</span><span style="color: #B48EAD">100</span><span style="color: #ECEFF4">,</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">%</span><span style="color: #EBCB8B">\n</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #88C0D0">classification_report</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">rf_predicted</span><span style="color: #ECEFF4">))</span></span></code></pre></div>



<p>Accuracy of Random Forest: 84.78260869565217 %</p>



<p><strong>Random Forest<br></strong>It is very similar to the Decision Tree; the difference is that it automatically builds many trees, forming a forest. It is a great technique when there is a large amount of data and many features.</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="rf = RandomForestClassifier(n_estimators=20, random_state=12,max_depth=5)
rf.fit(X_train,y_train)
rf_predicted = rf.predict(X_test)
rf_conf_matrix = confusion_matrix(y_test, rf_predicted)
rf_acc_score = accuracy_score(y_test, rf_predicted)
print(&quot;confussion matrix&quot;)
print(rf_conf_matrix)
print(&quot;\n&quot;)
print(&quot;Accuracy of Random Forest:&quot;,rf_acc_score*100,'%\n')
print(classification_report(y_test,rf_predicted))" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF">rf </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">RandomForestClassifier</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">n_estimators</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">20</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">random_state</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">12</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9">max_depth</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">5</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">rf</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">fit</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">y_train</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">rf_predicted </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> rf</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">predict</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_test</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">rf_conf_matrix </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">confusion_matrix</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> rf_predicted</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">rf_acc_score </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">accuracy_score</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> rf_predicted</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">confussion matrix</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">rf_conf_matrix</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #EBCB8B">\n</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Accuracy of Random Forest:</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">rf_acc_score</span><span style="color: #81A1C1">*</span><span style="color: #B48EAD">100</span><span style="color: #ECEFF4">,</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">%</span><span style="color: #EBCB8B">\n</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #88C0D0">classification_report</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">rf_predicted</span><span style="color: #ECEFF4">))</span></span></code></pre></div>



<p>Accuracy of Random Forest: 84.78260869565217 %</p>



<p>Interestingly, the accuracy came out the same as for the decision tree.</p>



<p>I then decided to vary the decision trees, mainly the split criterion and the maximum depth.</p>



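<p><strong>gini</strong> scores a node by its Gini impurity, the chance that an instance drawn from the node would be mislabeled if it were labeled at random according to the node's class proportions. <strong>entropy</strong> instead measures the disorder of the class distribution in the node. With either criterion, the tree picks the split that reduces the measure the most.</p>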
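<p>To make the two criteria concrete, here is a small sketch that computes both measures for the labels inside a node. The function names <em>gini</em> and <em>entropy</em> and the sample label lists are hypothetical helpers written for this illustration; scikit-learn computes these values internally.</p>

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: sum of p * log2(1 / p) over the classes."""
    n = len(labels)
    return sum((count / n) * log2(n / count) for count in Counter(labels).values())


pure = [1, 1, 1, 1]    # every instance has the same class
mixed = [0, 0, 1, 1]   # classes split evenly

print(gini(pure), entropy(pure))    # 0.0 0.0
print(gini(mixed), entropy(mixed))  # 0.5 1.0
```

<p>Both measures are zero for a pure node and reach their maximum for an even two-class split (0.5 for Gini, 1 bit for entropy), which is why the two criteria often lead to similar trees.</p>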



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="k_range = range(1,11)
scores = {}

for k in k_range:
  dtFor = DecisionTreeClassifier(criterion = 'entropy',random_state=0,max_depth = k)
  dtFor.fit(X_train, y_train)
  y_pred = dtFor.predict(X_test)
  scores[k] = accuracy_score(y_test,y_pred)
plt.plot(k_range,list(scores.values()), label='entropy')
for k in k_range:
  dtFor = DecisionTreeClassifier(criterion = 'gini',random_state=0,max_depth = k)
  dtFor.fit(X_train, y_train)
  y_pred = dtFor.predict(X_test)
  scores[k] = accuracy_score(y_test,y_pred)
plt.plot(k_range,list(scores.values()), label='gini')
plt.xlabel('Profundidade da Árvore')
plt.ylabel('% de Acurácia')
plt.legend()" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF">k_range </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">range</span><span style="color: #ECEFF4">(</span><span style="color: #B48EAD">1</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">11</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">scores </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">{}</span></span>
<span class="line"></span>
<span class="line"><span style="color: #81A1C1">for</span><span style="color: #D8DEE9FF"> k </span><span style="color: #81A1C1">in</span><span style="color: #D8DEE9FF"> k_range</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9FF">  dtFor </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">DecisionTreeClassifier</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">criterion</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">entropy</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9">random_state</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">0</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9">max_depth</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> k</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">  dtFor</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">fit</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_train</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">  y_pred </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> dtFor</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">predict</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_test</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">  scores</span><span style="color: #ECEFF4">[</span><span style="color: #D8DEE9FF">k</span><span style="color: #ECEFF4">]</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">accuracy_score</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">y_pred</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">plt</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">plot</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">k_range</span><span style="color: #ECEFF4">,</span><span style="color: #88C0D0">list</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">scores</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">values</span><span style="color: #ECEFF4">()),</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">label</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">entropy</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #81A1C1">for</span><span style="color: #D8DEE9FF"> k </span><span style="color: #81A1C1">in</span><span style="color: #D8DEE9FF"> k_range</span><span style="color: #ECEFF4">:</span></span>
<span class="line"><span style="color: #D8DEE9FF">  dtFor </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">DecisionTreeClassifier</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">criterion</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">gini</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9">random_state</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">0</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9">max_depth</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> k</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">  dtFor</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">fit</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_train</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">  y_pred </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> dtFor</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">predict</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_test</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">  scores</span><span style="color: #ECEFF4">[</span><span style="color: #D8DEE9FF">k</span><span style="color: #ECEFF4">]</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">accuracy_score</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">y_pred</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">plt</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">plot</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">k_range</span><span style="color: #ECEFF4">,</span><span style="color: #88C0D0">list</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">scores</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">values</span><span style="color: #ECEFF4">()),</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">label</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">gini</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">plt</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">xlabel</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Profundidade da Árvore</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">plt</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">ylabel</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&#39;</span><span style="color: #EBCB8B">% d</span><span style="color: #A3BE8C">e Acurácia</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">plt</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">legend</span><span style="color: #ECEFF4">()</span></span></code></pre></div>



<p>The resulting plot starts out with very good accuracy: </p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img decoding="async" width="576" height="432" src="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-1.png" alt="" class="wp-image-194" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-1.png 576w, https://ramondomingos.com.br/wp-content/uploads/2023/10/image-1-300x225.png 300w" sizes="(max-width: 576px) 100vw, 576px" /></figure>
</div>


<p>When we plot the tree with only 1 level of depth, we see that it looks at a single feature, <strong>thal</strong>, which refers to chest pain. That is quite predictable: anyone who goes to the hospital is very likely to be in some pain, so ideally the model would look at other features as well.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><img decoding="async" width="515" height="389" src="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-2.png" alt="" class="wp-image-195" style="aspect-ratio:1.3239074550128536;width:479px;height:auto" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-2.png 515w, https://ramondomingos.com.br/wp-content/uploads/2023/10/image-2-300x227.png 300w" sizes="(max-width: 515px) 100vw, 515px" /></figure>
</div>





<p>The second depth value with good accuracy is 3, and when we plot that tree we see that it takes other features into account.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="515" height="389" src="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-3.png" alt="" class="wp-image-196" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-3.png 515w, https://ramondomingos.com.br/wp-content/uploads/2023/10/image-3-300x227.png 300w" sizes="(max-width: 515px) 100vw, 515px" /></figure>
</div>
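<p>As a rough sketch of how trees like the two above can be inspected without a plot, the snippet below fits a depth-1 and a depth-3 tree and prints their rules with scikit-learn's <code>export_text</code>. The data is a synthetic stand-in generated with <code>make_classification</code> (an assumption, since the heart disease dataframe is not reproduced here), so the feature names and splits are illustrative only.</p>

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data (assumption): NOT the heart disease dataset.
X, y = make_classification(n_samples=300, n_features=6, n_informative=3,
                           random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

trees = {}
used = {}
for depth in (1, 3):
    tree = DecisionTreeClassifier(criterion="entropy", random_state=0,
                                  max_depth=depth)
    tree.fit(X, y)
    trees[depth] = tree
    # tree_.feature holds the split feature index per node; leaves are negative.
    used[depth] = {feature_names[i] for i in tree.tree_.feature if i >= 0}
    print(f"max_depth={depth}, features used: {sorted(used[depth])}")
    print(export_text(tree, feature_names=feature_names))
```

<p>A depth-1 tree has a single split, so it can only ever use one feature; allowing more depth gives the tree room to consult others.</p>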


<p><strong>K-NeighborsClassifier</strong></p>



<p>This algorithm looks at a sample's nearest neighbors to decide which class it belongs to. It supports several distance metrics, and we can vary the number of neighbors considered. In this study we used the Euclidean and Manhattan metrics, varying from 1 to 4 neighbors, and obtained the following accuracy levels.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><img loading="lazy" decoding="async" width="576" height="432" src="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-4.png" alt="" class="wp-image-198" style="aspect-ratio:1.3333333333333333;width:376px;height:auto" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-4.png 576w, https://ramondomingos.com.br/wp-content/uploads/2023/10/image-4-300x225.png 300w" sizes="(max-width: 576px) 100vw, 576px" /></figure>
</div>
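<p>A sweep like the one described above could look roughly like the following. This is a minimal sketch on synthetic data from <code>make_classification</code> (an assumption, the heart disease split is not reproduced here); the <code>results</code> dictionary and the 80/20 split are illustrative choices, not the study's exact setup.</p>

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption): NOT the heart disease dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Test accuracy for every (metric, number of neighbors) combination.
results = {}
for metric in ("euclidean", "manhattan"):
    for k in range(1, 5):
        knn = KNeighborsClassifier(n_neighbors=k, metric=metric)
        knn.fit(X_train, y_train)
        results[(metric, k)] = accuracy_score(y_test, knn.predict(X_test))

best = max(results, key=results.get)
print("best (metric, k):", best, "accuracy:", results[best])
```

<p>Picking the best combination on the test split, as done here, is fine for exploration, but a separate validation split or cross-validation gives a less optimistic estimate.</p>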


<p>So, using 3 neighbors and the Manhattan metric, we get roughly 71.7% accuracy.</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="knn = KNeighborsClassifier(n_neighbors=3, metric='manhattan')
knn.fit(X_train, y_train)
knn_predicted = knn.predict(X_test)
knn_conf_matrix = confusion_matrix(y_test, knn_predicted)
knn_acc_score_1_neighbors = accuracy_score(y_test, knn_predicted)
print(&quot;confusion matrix&quot;)
print(knn_conf_matrix)
print(&quot;\n&quot;)
print(&quot;Accuracy of K-NeighborsClassifier:&quot;,knn_acc_score_1_neighbors*100,'%\n')
print(classification_report(y_test,knn_predicted))" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF">knn </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">KNeighborsClassifier</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">n_neighbors</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">3</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">metric</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">manhattan</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">knn</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">fit</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_train</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">knn_predicted </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> knn</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">predict</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_test</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">knn_conf_matrix </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">confusion_matrix</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> knn_predicted</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">knn_acc_score_1_neighbors </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">accuracy_score</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> knn_predicted</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">confusion matrix</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">knn_conf_matrix</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #EBCB8B">\n</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Accuracy of K-NeighborsClassifier:</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">knn_acc_score_1_neighbors</span><span style="color: #81A1C1">*</span><span style="color: #B48EAD">100</span><span style="color: #ECEFF4">,</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">%</span><span style="color: #EBCB8B">\n</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #88C0D0">classification_report</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">knn_predicted</span><span style="color: #ECEFF4">))</span></span></code></pre></div>



<p>Accuracy of K-NeighborsClassifier: 71.73913043478261 %</p>



<p><strong>Support Vector Classifier</strong></p>






<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="svc =  SVC(kernel='rbf', C=2)
svc.fit(X_train, y_train)
svc_predicted = svc.predict(X_test)
svc_conf_matrix = confusion_matrix(y_test, svc_predicted)
svc_acc_score = accuracy_score(y_test, svc_predicted)
print(&quot;confusion matrix&quot;)
print(svc_conf_matrix)
print(&quot;\n&quot;)
print(&quot;Accuracy of Support Vector Classifier:&quot;,svc_acc_score*100,'%\n')
print(classification_report(y_test,svc_predicted))" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF">svc </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF">  </span><span style="color: #88C0D0">SVC</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">kernel</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">rbf</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">C</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">2</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">svc</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">fit</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_train</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">svc_predicted </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> svc</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">predict</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_test</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">svc_conf_matrix </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">confusion_matrix</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> svc_predicted</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">svc_acc_score </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">accuracy_score</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> svc_predicted</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">confusion matrix</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">svc_conf_matrix</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #EBCB8B">\n</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Accuracy of Support Vector Classifier:</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">svc_acc_score</span><span style="color: #81A1C1">*</span><span style="color: #B48EAD">100</span><span style="color: #ECEFF4">,</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">%</span><span style="color: #EBCB8B">\n</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #88C0D0">classification_report</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">svc_predicted</span><span style="color: #ECEFF4">))</span></span></code></pre></div>



<p>Accuracy of Support Vector Classifier: 71.73913043478261 %</p>



<p><strong>Logistic Regression</strong></p>






<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="from sklearn.linear_model import  LogisticRegression
reg = LogisticRegression( )
reg.fit(X_train, y_train)
reg_predicted = reg.predict(X_test)
reg_conf_matrix = confusion_matrix(y_test, reg_predicted)
reg_acc_score = accuracy_score(y_test, reg_predicted)
print(&quot;confusion matrix&quot;)
print(reg_conf_matrix)
print(&quot;\n&quot;)
print(&quot;Accuracy of Logistic Regression:&quot;,reg_acc_score*100,'%\n')
print(classification_report(y_test,reg_predicted))" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> sklearn</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">linear_model </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF">  LogisticRegression</span></span>
<span class="line"><span style="color: #D8DEE9FF">reg </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">LogisticRegression</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">reg</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">fit</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_train</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">reg_predicted </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> reg</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">predict</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">X_test</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">reg_conf_matrix </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">confusion_matrix</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> reg_predicted</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">reg_acc_score </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">accuracy_score</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> reg_predicted</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">confusion matrix</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">reg_conf_matrix</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #EBCB8B">\n</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">Accuracy of Logistic Regression:</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">reg_acc_score</span><span style="color: #81A1C1">*</span><span style="color: #B48EAD">100</span><span style="color: #ECEFF4">,</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">%</span><span style="color: #EBCB8B">\n</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #88C0D0">print</span><span style="color: #ECEFF4">(</span><span style="color: #88C0D0">classification_report</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">y_test</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF">reg_predicted</span><span style="color: #ECEFF4">))</span></span></code></pre></div>



<p>Accuracy of Logistic Regression: 91.30434782608695 %</p>



<h2 class="wp-block-heading">Comparison of results</h2>



<p>Random Forest 84.7826091%</p>



<p>K-Nearest Neighbour (10) 60.8695652%</p>



<p>K-Nearest Neighbour (3) 71.7391303%</p>



<p>Decision Tree 84.7826094%</p>



<p>Support Vector Machine 71.7391305%</p>



<p>Logistic Regression 91.304348%</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="363" src="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-5-1024x363.png" alt="" class="wp-image-199" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/10/image-5-1024x363.png 1024w, https://ramondomingos.com.br/wp-content/uploads/2023/10/image-5-300x106.png 300w, https://ramondomingos.com.br/wp-content/uploads/2023/10/image-5-768x272.png 768w, https://ramondomingos.com.br/wp-content/uploads/2023/10/image-5.png 1209w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Given the characteristics of the problem, the logistic regression model achieves the best result, at about 91.3% accuracy.</p>
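<p>For a quick side-by-side view, the accuracies listed above can be ranked with a few lines of plain Python. The values are the reported percentages rounded to two decimals; the <code>accuracies</code> and <code>ranking</code> names are just illustrative.</p>

```python
# Accuracies (in %) as reported in the comparison above, rounded to two decimals.
accuracies = {
    "Random Forest": 84.78,
    "K-Nearest Neighbour (10)": 60.87,
    "K-Nearest Neighbour (3)": 71.74,
    "Decision Tree": 84.78,
    "Support Vector Machine": 71.74,
    "Logistic Regression": 91.30,
}

# Sort from best to worst; ties keep their original order.
ranking = sorted(accuracies.items(), key=lambda item: item[1], reverse=True)
for name, acc in ranking:
    print(f"{name:<26} {acc:6.2f}%")

best_model, best_acc = ranking[0]
```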



<h2 class="wp-block-heading">Notes on the study</h2>



<p>This work was presented in the Machine Learning course and written up as a paper, together with my colleague <a href="https://www.linkedin.com/in/gerfesson/">Gerfesson</a>. We received the highest grade.</p>



<p>We also drew on several other studies as references, but the main one was the following, which we recommend reading: </p>



<p>K. Rashid, M. A. Islam, R. A. Tanzin, M. L. Labib, and M. Khan, “Heart disease prediction using interquartile range preprocessing and hypertuned machine learning,” in <em>2022 4th International Conference on Inventive Research in Computing Applications (ICIRCA)</em>, IEEE, Sept. 2022.</p>
<p>The post <a href="https://ramondomingos.com.br/aplicando-machine-learning-no-dataset-sobre-doencas-cardiacas/">Aplicando Machine Learning no dataset sobre Doenças cardíacas</a> appeared first on <a href="https://ramondomingos.com.br">Ramon Domingos Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://ramondomingos.com.br/aplicando-machine-learning-no-dataset-sobre-doencas-cardiacas/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Aplicando Árvore de decisão no dataset Íris</title>
		<link>https://ramondomingos.com.br/aplicando-arvore-de-decisao-no-dataset-iris/</link>
					<comments>https://ramondomingos.com.br/aplicando-arvore-de-decisao-no-dataset-iris/#comments</comments>
		
		<dc:creator><![CDATA[Ramon Domingos]]></dc:creator>
		<pubDate>Wed, 06 Sep 2023 21:41:31 +0000</pubDate>
				<category><![CDATA[Aprendizagem de máquina]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Scikit-Learn]]></category>
		<guid isPermaLink="false">https://ramondomingos.com.br/?p=179</guid>

					<description><![CDATA[<p>decision chart</p>
<p>The post <a href="https://ramondomingos.com.br/aplicando-arvore-de-decisao-no-dataset-iris/">Aplicando Árvore de decisão no dataset Íris</a> appeared first on <a href="https://ramondomingos.com.br">Ramon Domingos Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>In the <a href="https://ramondomingos.com.br/conceito-da-arvore-de-decisao-aprendizado-de-maquina/">previous post</a> we saw a simple application of the Decision Tree algorithm, used to decide whether or not we would go to university on a given day. Our training set had few rows, and overall there were few decisions to make: it was just GO or DON'T GO. But as the set of possible decisions grows, the amount of data we need to validate our model also tends to grow.</p>



<p>As usual, all the examples are available on <a href="https://drive.google.com/file/d/1E76nyf4BcAuUy2NNbWOPdPvj6T5IPSox/view?usp=sharing">Colab</a>. </p>



<p>Let's start by importing our libraries, loading the Iris toy dataset, and turning it into a pandas DataFrame.</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="import pandas as pd
from sklearn.datasets import load_iris
data = load_iris()
iris = pd.DataFrame(data.data)
iris.columns = data.feature_names
iris['target'] = data.target
iris.head()" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> pandas </span><span style="color: #81A1C1">as</span><span style="color: #D8DEE9FF"> pd</span></span>
<span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> sklearn</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">datasets </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> load_iris</span></span>
<span class="line"><span style="color: #D8DEE9FF">data </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">load_iris</span><span style="color: #ECEFF4">()</span></span>
<span class="line"><span style="color: #D8DEE9FF">iris </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> pd</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">DataFrame</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">data</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">data</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">iris</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">columns </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> data</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">feature_names</span></span>
<span class="line"><span style="color: #D8DEE9FF">iris</span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">target</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">]</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> data</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">target</span></span>
<span class="line"><span style="color: #D8DEE9FF">iris</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">head</span><span style="color: #ECEFF4">()</span></span></code></pre></div>



<p>To keep things didactic, we will start our study with only <strong>2 features</strong>, the petal length and the petal width, so that the data can be plotted on a Cartesian plane. We also keep only classes 1 and 2 (versicolor and virginica), which makes this first step a two-class problem. Afterwards we add all the fields back.</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="irisCopy = iris.loc[iris.target.isin([1,2]), ['petal length (cm)','petal width (cm)' , 'target']]
# split into x (features) and y (target)
x = irisCopy.drop( 'target', axis=1)
y = irisCopy.target" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF">irisCopy </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> iris</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">loc</span><span style="color: #ECEFF4">[</span><span style="color: #D8DEE9FF">iris</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">target</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">isin</span><span style="color: #ECEFF4">([</span><span style="color: #B48EAD">1</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">2</span><span style="color: #ECEFF4">]),</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">petal length (cm)</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">petal width (cm)</span><span style="color: #ECEFF4">&#39;</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: 
#A3BE8C">target</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">]]</span></span>
<span class="line"><span style="color: #616E88"># split into x (features) and y (target)</span></span>
<span class="line"><span style="color: #D8DEE9FF">x </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> irisCopy</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">drop</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">target</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">axis</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">1</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">y </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> irisCopy</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">target</span></span></code></pre></div>



<p>Even with the 100 rows left after the filter, we have enough data to split into two sets, train and test. We do this with <code>train_test_split</code>, holding out 30% of the rows for testing.</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="from sklearn.model_selection import train_test_split
x_train, x_teste, y_train, y_test = train_test_split( x, y , test_size=0.30, random_state=22)" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> sklearn</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">model_selection </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> train_test_split</span></span>
<span class="line"><span style="color: #D8DEE9FF">x_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> x_teste</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_test </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">train_test_split</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF"> x</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y </span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">test_size</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">0.30</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">random_state</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">22</span><span style="color: #ECEFF4">)</span></span></code></pre></div>
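
<p>A random split does not guarantee that both classes keep the same proportion in the train and test parts. The sketch below is one possible variation, not the post's original code: it assumes we want to preserve the class balance and passes <code>stratify=y</code> to <code>train_test_split</code>. Loading the data with <code>as_frame=True</code> and the name <code>subset</code> are illustrative choices.</p>

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load iris as a DataFrame (assumption: scikit-learn >= 0.23 for as_frame=True)
iris = load_iris(as_frame=True).frame

# Same filter as in the post: classes 1 and 2, petal features only
subset = iris.loc[
    iris["target"].isin([1, 2]),
    ["petal length (cm)", "petal width (cm)", "target"],
]
x = subset.drop("target", axis=1)
y = subset["target"]

# stratify=y keeps the class proportions equal in both parts
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.30, random_state=22, stratify=y
)

print("train rows:", len(x_train), "test rows:", len(x_test))
print("train classes:", y_train.value_counts().to_dict())
print("test classes:", y_test.value_counts().to_dict())
```

<p>Printing the class counts is a quick way to confirm that neither part ended up dominated by one species.</p>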



<p>We now have our train and test sets, so let's build the classifier using the training set.<img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f3cb-1f3fd.png" alt="🏋🏽" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="from sklearn import tree
import matplotlib.pyplot as plt

clf =  tree.DecisionTreeClassifier(random_state=22)
clf = clf.fit(x_train, y_train)
fig, ax = plt.subplots(figsize=(10,8))

tree.plot_tree(clf)
plt.show()" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> sklearn </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> tree</span></span>
<span class="line"><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> matplotlib</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">pyplot </span><span style="color: #81A1C1">as</span><span style="color: #D8DEE9FF"> plt</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">clf </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF">  tree</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">DecisionTreeClassifier</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">random_state</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">22</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">clf </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> clf</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">fit</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">x_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_train</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">fig</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> ax </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> plt</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">subplots</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">figsize</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">(</span><span style="color: #B48EAD">10</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">8</span><span style="color: #ECEFF4">))</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">tree</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">plot_tree</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">clf</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">plt</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">show</span><span style="color: #ECEFF4">()</span></span></code></pre></div>



<p>We get this tree:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="794" src="https://ramondomingos.com.br/wp-content/uploads/2023/09/image-8-1024x794.png" alt="Decision tree" class="wp-image-182" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/09/image-8-1024x794.png 1024w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-8-300x233.png 300w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-8-768x596.png 768w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-8.png 1338w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>
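
<p>The plotted tree labels the columns as <code>x[0]</code> and <code>x[1]</code>. As a complement, scikit-learn can also print the same rules as text with the real column names through <code>export_text</code>. This is a minimal sketch that rebuilds the two-feature model from scratch; the names <code>features</code>, <code>subset</code> and <code>rules</code> are illustrative and not part of the original post.</p>

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Rebuild the two-feature, two-class data set used in the post
iris = load_iris(as_frame=True).frame
features = ["petal length (cm)", "petal width (cm)"]
subset = iris[iris["target"].isin([1, 2])]

x_train, x_test, y_train, y_test = train_test_split(
    subset[features], subset["target"], test_size=0.30, random_state=22
)
clf = DecisionTreeClassifier(random_state=22).fit(x_train, y_train)

# Text version of the tree, one indented line per node
rules = export_text(clf, feature_names=features)
print(rules)
```

<p>Each indentation level in the output corresponds to one level of depth in the tree, which makes it easier to follow a single path from the root to a leaf.</p>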



<p>Now let's look at each node and the condition it tests and, based on that, draw lines on a plot to see how each decision splits the data:</p>



<ul class="wp-block-list">
<li>x[0] &lt;= 4.75 (x[0] is the petal length, on the x-axis)</li>



<li>x[0] &lt;= 5.05</li>



<li>x[1] &lt;= 1.65 (x[1] is the petal width, on the y-axis)</li>



<li>x[1] &lt;= 1.6</li>



<li>x[0] &lt;= 4.85</li>
</ul>
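
<p>Instead of copying the thresholds from the picture by hand, they can be read from the fitted model. The sketch below walks the <code>tree_</code> attribute of a <code>DecisionTreeClassifier</code> and prints the test applied at every internal node. It retrains the same two-feature model so that it runs on its own; the list <code>splits</code> and the other variable names are illustrative.</p>

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris(as_frame=True).frame
features = ["petal length (cm)", "petal width (cm)"]
subset = iris[iris["target"].isin([1, 2])]

x_train, x_test, y_train, y_test = train_test_split(
    subset[features], subset["target"], test_size=0.30, random_state=22
)
clf = DecisionTreeClassifier(random_state=22).fit(x_train, y_train)

tree_ = clf.tree_
splits = []
for node in range(tree_.node_count):
    # Leaves have no children; scikit-learn marks them with -1
    if tree_.children_left[node] == -1:
        continue
    name = features[tree_.feature[node]]
    threshold = float(tree_.threshold[node])
    splits.append((node, name, threshold))
    # Samples satisfying the test go to the left child
    print(f"node {node}: {name} <= {threshold:.2f}")
```

<p>Every vertical line in the scatter plot corresponds to a test on the petal length, and every horizontal line to a test on the petal width.</p>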



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="
fig, ax = plt.subplots()
ax.scatter(
    x_train['petal length (cm)'],
    x_train['petal width (cm)'],
    c=y_train
)

ax.plot([4.75,4.75], [0,3], '--r') # first node
ax.plot([2,4.75],[1.65,1.65], '--r') # second node
ax.plot([5.05,5.05], [3,0], '--r') # third node
ax.plot([4.75,5.05],[1.6,1.6], '--r') # fourth node
ax.plot([4.75,5.05],[1.75,1.75], '--r') # fifth node
ax.plot([4.85,4.85], [1.75,3], '--r') # sixth node

ax.set( xlim=(3, 7), xticks=[2,3,4,5,6,7], ylim=(0.9,2.7), yticks=[1,1.5,2,2.5])
plt.show()" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">fig</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> ax </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> plt</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">subplots</span><span style="color: #ECEFF4">()</span></span>
<span class="line"><span style="color: #D8DEE9FF">ax</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">scatter</span><span style="color: #ECEFF4">(</span></span>
<span class="line"><span style="color: #D8DEE9FF">    x_train</span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">petal length (cm)</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">],</span></span>
<span class="line"><span style="color: #D8DEE9FF">    x_train</span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">petal width (cm)</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">],</span></span>
<span class="line"><span style="color: #D8DEE9FF">    </span><span style="color: #D8DEE9">c</span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF">y_train</span></span>
<span class="line"><span style="color: #ECEFF4">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">ax</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">plot</span><span style="color: #ECEFF4">([</span><span style="color: #B48EAD">4.75</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">4.75</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">[</span><span style="color: #B48EAD">0</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">3</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">--r</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span><span style="color: #D8DEE9FF"> </span><span style="color: #616E88"># first node</span></span>
<span class="line"><span style="color: #D8DEE9FF">ax</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">plot</span><span style="color: #ECEFF4">([</span><span style="color: #B48EAD">2</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">4.75</span><span style="color: #ECEFF4">],[</span><span style="color: #B48EAD">1.65</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">1.65</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">--r</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span><span style="color: #D8DEE9FF"> </span><span style="color: #616E88"># second node</span></span>
<span class="line"><span style="color: #D8DEE9FF">ax</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">plot</span><span style="color: #ECEFF4">([</span><span style="color: #B48EAD">5.05</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">5.05</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">[</span><span style="color: #B48EAD">3</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">0</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">--r</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span><span style="color: #D8DEE9FF"> </span><span style="color: #616E88"># third node</span></span>
<span class="line"><span style="color: #D8DEE9FF">ax</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">plot</span><span style="color: #ECEFF4">([</span><span style="color: #B48EAD">4.75</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">5.05</span><span style="color: #ECEFF4">],[</span><span style="color: #B48EAD">1.6</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">1.6</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">--r</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span><span style="color: #D8DEE9FF"> </span><span style="color: #616E88"># fourth node</span></span>
<span class="line"><span style="color: #D8DEE9FF">ax</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">plot</span><span style="color: #ECEFF4">([</span><span style="color: #B48EAD">4.75</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">5.05</span><span style="color: #ECEFF4">],[</span><span style="color: #B48EAD">1.75</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">1.75</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">--r</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span><span style="color: #D8DEE9FF"> </span><span style="color: #616E88"># fifth node</span></span>
<span class="line"><span style="color: #D8DEE9FF">ax</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">plot</span><span style="color: #ECEFF4">([</span><span style="color: #B48EAD">4.85</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">4.85</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">[</span><span style="color: #B48EAD">1.75</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">3</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">--r</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">)</span><span style="color: #D8DEE9FF"> </span><span style="color: #616E88"># sixth node</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">ax</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">set</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">xlim</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">(</span><span style="color: #B48EAD">3</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #B48EAD">7</span><span style="color: #ECEFF4">),</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">xticks</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">[</span><span style="color: #B48EAD">2</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">3</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">4</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">5</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">6</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">7</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">ylim</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">(</span><span style="color: #B48EAD">0.9</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">2.7</span><span style="color: #ECEFF4">),</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">yticks</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">[</span><span style="color: #B48EAD">1</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">1.5</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">2</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">2.5</span><span style="color: #ECEFF4">])</span></span>
<span class="line"><span style="color: #D8DEE9FF">plt</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">show</span><span style="color: #ECEFF4">()</span></span></code></pre></div>



<p>The plot shows the following lines:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="904" height="704" src="https://ramondomingos.com.br/wp-content/uploads/2023/09/image-9.png" alt="Plot of the decision lines" class="wp-image-183" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/09/image-9.png 904w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-9-300x234.png 300w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-9-768x598.png 768w" sizes="(max-width: 904px) 100vw, 904px" /></figure>
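
<p>Drawing each line by hand works for a small tree. An alternative, sketched here under the assumption that we only want to see which region maps to which class, is to ask the model for a prediction at every point of a fine grid and shade the result. The grid limits reuse the axis limits of the plot above; the names <code>xx</code>, <code>yy</code>, <code>grid</code> and <code>zz</code> are illustrative.</p>

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris(as_frame=True).frame
features = ["petal length (cm)", "petal width (cm)"]
subset = iris[iris["target"].isin([1, 2])]

x_train, x_test, y_train, y_test = train_test_split(
    subset[features], subset["target"], test_size=0.30, random_state=22
)
# Fit on plain arrays so that predicting on the grid raises no name warnings
clf = DecisionTreeClassifier(random_state=22).fit(x_train.to_numpy(), y_train)

# One prediction per grid point covering the plotted area
xx, yy = np.meshgrid(np.linspace(3, 7, 200), np.linspace(0.9, 2.7, 200))
grid = np.column_stack([xx.ravel(), yy.ravel()])
zz = clf.predict(grid).reshape(xx.shape)

fig, ax = plt.subplots()
ax.contourf(xx, yy, zz, alpha=0.3)  # shaded decision regions
ax.scatter(x_train[features[0]], x_train[features[1]], c=y_train)
ax.set(xlabel=features[0], ylabel=features[1])
# In an interactive session, call plt.show() to display the figure
```

<p>The borders between the shaded areas fall on the same thresholds as the dashed lines, since both come from the same fitted tree.</p>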



<p>This way we can see which decisions the algorithm made. Now we can go a step further: instead of limiting ourselves to 2 features, we let the algorithm train on the full dataset, with every feature and every class, and see an even larger tree.</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="
x_train, x_teste, y_train, y_test = train_test_split( iris.drop( 'target', axis=1), iris.target , test_size=0.20, random_state=10)

clf2 =  tree.DecisionTreeClassifier(random_state=22).fit(x_train, y_train)

fig, ax = plt.subplots(figsize=(10,8))

tree.plot_tree(clf2)
plt.show()" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">x_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> x_teste</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_test </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> </span><span style="color: #88C0D0">train_test_split</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF"> iris</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">drop</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">target</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">axis</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">1</span><span style="color: #ECEFF4">),</span><span style="color: #D8DEE9FF"> iris</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">target </span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">test_size</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">0.20</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">random_state</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">10</span><span style="color: #ECEFF4">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">clf2 </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF">  tree</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">DecisionTreeClassifier</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">random_state</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">22</span><span style="color: #ECEFF4">).</span><span style="color: #88C0D0">fit</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">x_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_train</span><span style="color: #ECEFF4">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">fig</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> ax </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> plt</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">subplots</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">figsize</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">(</span><span style="color: #B48EAD">10</span><span style="color: #ECEFF4">,</span><span style="color: #B48EAD">8</span><span style="color: #ECEFF4">))</span></span>
<span class="line"></span>
<span class="line"><span style="color: #D8DEE9FF">tree</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">plot_tree</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">clf2</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">plt</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">show</span><span style="color: #ECEFF4">()</span></span></code></pre></div>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" src="https://ramondomingos.com.br/wp-content/uploads/2023/09/image-7-1024x799.png" alt="Decision tree" class="wp-image-181" style="width:840px;height:656px" width="840" height="656" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/09/image-7-1024x799.png 1024w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-7-300x234.png 300w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-7-768x599.png 768w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-7.png 1302w" sizes="(max-width: 840px) 100vw, 840px" /></figure>



<p>Now let's evaluate our model and check its score on the training data:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="clf2.score(x_train, y_train)
# 1" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #D8DEE9FF">clf2</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">score</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">x_train</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> y_train</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #616E88"># 1</span></span></code></pre></div>



<p>An excellent fit: a perfect score of 1 on the training data. But this is not the only way to evaluate a model. There are other metrics, which we will cover in another post.</p>
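<p>Since the score above is measured on the same data the model was trained on, a held-out split gives a second point of comparison. The following is a minimal sketch, assuming <code>clf2</code> is a <code>DecisionTreeClassifier</code> trained on the Iris data from scikit-learn; the split parameters and the names <code>x_test</code>, <code>y_test</code>, <code>train_score</code> and <code>test_score</code> are assumptions, not taken from the original notebook.</p>

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load Iris and hold out 30% of the rows for evaluation (assumed split).
x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=42, stratify=y
)

# Train an unpruned decision tree on the training portion only.
clf2 = DecisionTreeClassifier(random_state=42).fit(x_train, y_train)

# Accuracy on data the tree has seen versus data it has not seen.
train_score = clf2.score(x_train, y_train)
test_score = clf2.score(x_test, y_test)
print(f"train accuracy: {train_score:.3f}")
print(f"test accuracy:  {test_score:.3f}")
```

<p>A gap between the two numbers is a common sign of overfitting, which is one reason a perfect training score alone does not say much.</p>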
<p>The post <a href="https://ramondomingos.com.br/aplicando-arvore-de-decisao-no-dataset-iris/">Aplicando Árvore de decisão no dataset Íris</a> first appeared on <a href="https://ramondomingos.com.br">Ramon Domingos Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://ramondomingos.com.br/aplicando-arvore-de-decisao-no-dataset-iris/feed/</wfw:commentRss>
			<slash:comments>1</slash:comments>
		
		
			</item>
		<item>
		<title>Conceito da Árvore de decisão &#8211; Aprendizado de máquina</title>
		<link>https://ramondomingos.com.br/conceito-da-arvore-de-decisao-aprendizado-de-maquina/</link>
					<comments>https://ramondomingos.com.br/conceito-da-arvore-de-decisao-aprendizado-de-maquina/#comments</comments>
		
		<dc:creator><![CDATA[Ramon Domingos]]></dc:creator>
		<pubDate>Wed, 06 Sep 2023 17:26:52 +0000</pubDate>
				<category><![CDATA[Aprendizagem de máquina]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Scikit-Learn]]></category>
		<guid isPermaLink="false">https://ramondomingos.com.br/?p=168</guid>

					<description><![CDATA[<p>The decision tree algorithm is quite popular and comes with graphical representations of how it makes its decisions. This helps in understanding the operations it performs and in anticipating possible failures in more critical cases, so that more scenarios of that kind can be added to the training data. In this post we will use a&#8230;</p>
<p>The post <a href="https://ramondomingos.com.br/conceito-da-arvore-de-decisao-aprendizado-de-maquina/">Conceito da Árvore de decisão &#8211; Aprendizado de máquina</a> first appeared on <a href="https://ramondomingos.com.br">Ramon Domingos Blog</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>The decision tree algorithm is quite popular and comes with graphical representations of how it makes its decisions. This helps in understanding the operations it performs and in anticipating possible failures in more critical cases, so that more scenarios of that kind can be added to the training data.</p>



<p>In this post we will use a simple scenario with only a few nodes, to understand how the algorithm works and in which situations it is a good choice. In the next post we will use larger datasets, with more decisions than just Yes/No.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" width="1024" height="1024" src="https://ramondomingos.com.br/wp-content/uploads/2023/09/Arvore-de-decisao-1-1024x1024.png" alt="" class="wp-image-171" style="width:639px;height:639px" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/09/Arvore-de-decisao-1-1024x1024.png 1024w, https://ramondomingos.com.br/wp-content/uploads/2023/09/Arvore-de-decisao-1-300x300.png 300w, https://ramondomingos.com.br/wp-content/uploads/2023/09/Arvore-de-decisao-1-150x150.png 150w, https://ramondomingos.com.br/wp-content/uploads/2023/09/Arvore-de-decisao-1-768x768.png 768w, https://ramondomingos.com.br/wp-content/uploads/2023/09/Arvore-de-decisao-1.png 1080w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>In general, this algorithm aims either to classify a record (classification problems) or to estimate a value (regression problems). As the image shows, each question, called a <strong>decision node</strong>, is answered with YES or NO. The first question, the starting node, is the <strong>root node</strong>, and the last one, which holds the answer, is the <strong>leaf node</strong>.</p>
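<p>A tree like this can be read as a chain of nested conditions. The sketch below writes one by hand for the class-attendance scenario used later in this post. The function name <code>will_go</code>, its parameter names, and the order of the questions are assumptions made for illustration, since the diagram itself is only available as an image.</p>

```python
def will_go(has_class: bool, is_remote: bool, by_car: bool, by_bus: bool) -> str:
    """Hand-written decision tree; each `if` is a decision node."""
    # Root node: the first question asked.
    if not has_class:
        return "Não vou"  # leaf node
    # Decision node: a remote class means there is no need to travel.
    if is_remote:
        return "Não vou"  # leaf node
    # Decision node: is any transport available?
    if by_car or by_bus:
        return "Vou"  # leaf node
    return "Não vou"  # leaf node


print(will_go(True, False, True, False))
print(will_go(True, True, False, False))
```

<p>Training a model replaces this manual step: the algorithm picks the questions and their order from the data.</p>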



<p>But how do we get from a simple visual diagram to a model?</p>



<p><em><strong>scikit-learn</strong></em> handles this training and can also render a visual representation of the decisions, like this one:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" width="1024" height="725" src="https://ramondomingos.com.br/wp-content/uploads/2023/09/image-4-1024x725.png" alt="" class="wp-image-172" style="width:565px;height:400px" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/09/image-4-1024x725.png 1024w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-4-300x213.png 300w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-4-768x544.png 768w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-4-1536x1088.png 1536w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-4.png 1646w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>I prepared a <a href="https://colab.research.google.com/drive/1D_qsU6QAtFJTeiKncr6UosOD2fsN_bdf?usp=sharing">Colab</a> notebook with the examples used in this post.</p>



<p>First, I built an array with NumPy based on this scenario, converted it to a pandas DataFrame, and displayed the table:</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="import pandas as pd
import numpy as np
# Build an array of outcomes
numpy_array = np.array([
[True,True,False,False,False], [False,False,False,False,False],
[True,False,True,False,True], [True,False,False,True,True], 
[True,False,False,False,False]])
# Convert to a pandas DataFrame
df = pd.DataFrame(numpy_array, columns=['Tenho aula?', 'É Remoto', 'Vou de Carro', 'Vou de ônibus', 'target'])
df[&quot;target&quot;] = df[&quot;target&quot;].astype(int)
df['target_names']= pd.Categorical.from_codes (df[&quot;target&quot;], ['Não vou', 'Vou'])
# Display
df.head()" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> pandas </span><span style="color: #81A1C1">as</span><span style="color: #D8DEE9FF"> pd</span></span>
<span class="line"><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> numpy </span><span style="color: #81A1C1">as</span><span style="color: #D8DEE9FF"> np</span></span>
<span class="line"><span style="color: #616E88"># Build an array of outcomes</span></span>
<span class="line"><span style="color: #D8DEE9FF">numpy_array </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> np</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">array</span><span style="color: #ECEFF4">([</span></span>
<span class="line"><span style="color: #ECEFF4">[</span><span style="color: #81A1C1">True</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">True</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">[</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">],</span></span>
<span class="line"><span style="color: #ECEFF4">[</span><span style="color: #81A1C1">True</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">True</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">True</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">[</span><span style="color: #81A1C1">True</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">True</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">True</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span></span>
<span class="line"><span style="color: #ECEFF4">[</span><span style="color: #81A1C1">True</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">,</span><span style="color: #81A1C1">False</span><span style="color: #ECEFF4">]])</span></span>
<span class="line"><span style="color: #616E88"># Convert to a pandas DataFrame</span></span>
<span class="line"><span style="color: #D8DEE9FF">df </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> pd</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">DataFrame</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">numpy_array</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #D8DEE9">columns</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Tenho aula?</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">É Remoto</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Vou de Carro</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Vou de ônibus</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">target</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">])</span></span>
<span class="line"><span style="color: #D8DEE9FF">df</span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">target</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">]</span><span style="color: #D8DEE9FF"> </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> df</span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">target</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">].</span><span style="color: #88C0D0">astype</span><span style="color: #ECEFF4">(</span><span style="color: #88C0D0">int</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">df</span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">target_names</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">]</span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> pd</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">Categorical</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">from_codes</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">df</span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&quot;</span><span style="color: #A3BE8C">target</span><span style="color: #ECEFF4">&quot;</span><span style="color: #ECEFF4">],</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Não vou</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">Vou</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">])</span></span>
<span class="line"><span style="color: #616E88"># Display</span></span>
<span class="line"><span style="color: #D8DEE9FF">df</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">head</span><span style="color: #ECEFF4">()</span></span></code></pre></div>



<p>Here is the result:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="328" src="https://ramondomingos.com.br/wp-content/uploads/2023/09/image-5-1024x328.png" alt="" class="wp-image-175" srcset="https://ramondomingos.com.br/wp-content/uploads/2023/09/image-5-1024x328.png 1024w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-5-300x96.png 300w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-5-768x246.png 768w, https://ramondomingos.com.br/wp-content/uploads/2023/09/image-5.png 1316w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Next, I use scikit-learn to create a classifier, train the model, and build the decision tree, and then plot the graphical representation shown earlier.</p>



<div class="wp-block-kevinbatdorf-code-block-pro" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="from sklearn import tree
# Assumption: `dados` (undefined in the post) holds the feature columns only
dados = df.drop(columns=['target', 'target_names'])
clf = tree.DecisionTreeClassifier(random_state=42)
clf = clf.fit(dados, df.target)
tree.plot_tree(clf)" style="color:#d8dee9ff;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki nord" style="background-color: #2e3440ff" tabindex="0"><code><span class="line"><span style="color: #81A1C1">from</span><span style="color: #D8DEE9FF"> sklearn </span><span style="color: #81A1C1">import</span><span style="color: #D8DEE9FF"> tree</span></span>
<span class="line"><span style="color: #616E88"># Assumption: `dados` (undefined in the post) holds the feature columns only</span></span>
<span class="line"><span style="color: #D8DEE9FF">dados </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> df</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">drop</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">columns</span><span style="color: #81A1C1">=</span><span style="color: #ECEFF4">[</span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">target</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> </span><span style="color: #ECEFF4">&#39;</span><span style="color: #A3BE8C">target_names</span><span style="color: #ECEFF4">&#39;</span><span style="color: #ECEFF4">])</span></span>
<span class="line"><span style="color: #D8DEE9FF">clf </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> tree</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">DecisionTreeClassifier</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9">random_state</span><span style="color: #81A1C1">=</span><span style="color: #B48EAD">42</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">clf </span><span style="color: #81A1C1">=</span><span style="color: #D8DEE9FF"> clf</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">fit</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">dados</span><span style="color: #ECEFF4">,</span><span style="color: #D8DEE9FF"> df</span><span style="color: #ECEFF4">.</span><span style="color: #D8DEE9FF">target</span><span style="color: #ECEFF4">)</span></span>
<span class="line"><span style="color: #D8DEE9FF">tree</span><span style="color: #ECEFF4">.</span><span style="color: #88C0D0">plot_tree</span><span style="color: #ECEFF4">(</span><span style="color: #D8DEE9FF">clf</span><span style="color: #ECEFF4">)</span></span></code></pre></div>
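
<p>Besides the plot, the learned rules can be printed as text and the model can be asked for a prediction. This is a minimal, self-contained sketch that rebuilds the same five rows. The names <code>features</code>, <code>rows</code>, <code>X</code>, <code>y</code> and <code>new_day</code> are assumptions introduced here, and selecting the four question columns as features is a guess at what the notebook does.</p>

```python
import numpy as np
import pandas as pd
from sklearn import tree

# Same five rows as in the post: four yes/no questions plus the target.
features = ['Tenho aula?', 'É Remoto', 'Vou de Carro', 'Vou de ônibus']
rows = np.array([
    [True, True, False, False, False],
    [False, False, False, False, False],
    [True, False, True, False, True],
    [True, False, False, True, True],
    [True, False, False, False, False],
])
df = pd.DataFrame(rows, columns=features + ['target'])

# Features are the question columns; the target is 0 (Não vou) or 1 (Vou).
X = df[features]
y = df['target'].astype(int)

clf = tree.DecisionTreeClassifier(random_state=42).fit(X, y)

# Text rendering of the decision nodes and leaves.
print(tree.export_text(clf, feature_names=features))

# Ask the model about a day with an in-person class and a car available.
new_day = pd.DataFrame([[True, False, True, False]], columns=features)
prediction = clf.predict(new_day)[0]
print(['Não vou', 'Vou'][prediction])
```

<p>With only five rows the tree simply memorizes the table, so this is useful for reading the rules rather than for judging how well the model generalizes.</p>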



<p>In the next post, we will apply this algorithm to one of the toy datasets.</p>
<p>The post <a href="https://ramondomingos.com.br/conceito-da-arvore-de-decisao-aprendizado-de-maquina/">Conceito da Árvore de decisão &#8211; Aprendizado de máquina</a> first appeared on <a href="https://ramondomingos.com.br">Ramon Domingos Blog</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://ramondomingos.com.br/conceito-da-arvore-de-decisao-aprendizado-de-maquina/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
	</channel>
</rss>
