<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>vault Archives - Big Data Processing</title>
	<atom:link href="https://bigdataproc.com/tag/vault/feed/" rel="self" type="application/rss+xml" />
	<link>https://bigdataproc.com/tag/vault/</link>
	<description>Big Data Solution for GCP, AWS, Azure and on-prem</description>
	<lastBuildDate>Mon, 08 Jul 2024 14:07:12 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	
	<item>
		<title>GCP Cloud Composer &#8211; Configure Hashicorp Vault to store connection and variables</title>
		<link>https://bigdataproc.com/gcp-cloud-composer-configure-hashicorp-vault-to-store-connection-and-variables/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=gcp-cloud-composer-configure-hashicorp-vault-to-store-connection-and-variables</link>
					<comments>https://bigdataproc.com/gcp-cloud-composer-configure-hashicorp-vault-to-store-connection-and-variables/#respond</comments>
		
		<dc:creator><![CDATA[Gaurang]]></dc:creator>
		<pubDate>Mon, 15 Jul 2024 22:00:00 +0000</pubDate>
				<category><![CDATA[Airflow]]></category>
		<category><![CDATA[GCP]]></category>
		<category><![CDATA[airflow]]></category>
		<category><![CDATA[cloud composer]]></category>
		<category><![CDATA[gcp]]></category>
		<category><![CDATA[vault]]></category>
		<guid isPermaLink="false">https://bigdataproc.com/?p=494</guid>

					<description><![CDATA[<p>Connect Airflow to HashiCorp Vault to store Airflow connections and variables. </p>
<div class="more-link-wrapper"><a class="more-link" href="https://bigdataproc.com/gcp-cloud-composer-configure-hashicorp-vault-to-store-connection-and-variables/">Continue reading<span class="screen-reader-text">GCP Cloud Composer &#8211; Configure Hashicorp Vault to store connection and variables</span></a></div>
<p>The post <a href="https://bigdataproc.com/gcp-cloud-composer-configure-hashicorp-vault-to-store-connection-and-variables/">GCP Cloud Composer &#8211; Configure Hashicorp Vault to store connection and variables</a> appeared first on <a href="https://bigdataproc.com">Big Data Processing </a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Airflow supports multiple secrets backends for storing Airflow connections and variables. For a long time we used the default Airflow backend, but recently I migrated all our connections to HashiCorp Vault and started using it as our secrets backend. In this post I will show you, step by step, how to do this.</p>



<h2 class="wp-block-heading">Configure HashiCorp Vault</h2>



<h3 class="wp-block-heading">Create mount point</h3>



<p>We have multiple Cloud Composer (Airflow) environments, so the strategy we use is to create a single mount point named <code>airflow</code> and then use a different path for each Airflow instance. You could choose a different strategy to suit your organization&#8217;s standards and requirements. Run the following command to create the secrets mount point:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="bash" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">vault secrets enable -path=airflow -version=2 kv</pre>
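

<p>To confirm the mount exists, you can list the secrets engines (a quick check, assuming your <code>vault</code> CLI is already authenticated against the server):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="bash" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">vault secrets list | grep airflow</pre>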



<h3 class="wp-block-heading">Create role </h3>



<p>Vault provides multiple ways to authenticate; we are going to use the AppRole method, so let&#8217;s create a role. You will also need a secret ID for this role: copy it and store it somewhere safe, as it is required later when we configure the Vault connection in Airflow.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="bash" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="create approle" data-enlighter-group="">vault write auth/approle/role/gcp_composer_role \
    role_id=gcp_composer_role \
    secret_id_ttl=0 \
    secret_id_num_uses=0 \
    token_num_uses=0 \
    token_ttl=24h \
    token_max_ttl=24h \
    token_policies=gcp_composer_policy</pre>
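

<p>Note that creating the role does not emit a secret ID by itself. The commands below are a sketch of the remaining steps, assuming the AppRole auth method is available on your server: enable AppRole (if not already enabled) and generate the secret ID you will need in Airflow.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="bash" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="generate secret id" data-enlighter-group=""># enable the AppRole auth method (skip if already enabled)
vault auth enable approle

# confirm the role ID (we set it explicitly to gcp_composer_role above)
vault read auth/approle/role/gcp_composer_role/role-id

# generate a secret ID for the role -- copy it from the output
vault write -f auth/approle/role/gcp_composer_role/secret-id</pre>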



<h3 class="wp-block-heading">Create Policy </h3>



<p>The role needs to be associated with a policy. A policy is simply a grant (a set of access rules). Run the following command to create a policy that gives <code>read</code> and <code>list</code> permissions on the <code>airflow</code> path we created earlier:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="bash" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">vault policy write gcp_composer_policy - &lt;&lt;EOF
path "airflow/*" {
  capabilities = ["read", "list"]
}
EOF</pre>
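

<p>Optionally, you can verify the policy and test the AppRole login end to end (a sketch; replace <code>&lt;your_secret_id></code> with the secret ID you generated):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="bash" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># show the policy we just wrote
vault policy read gcp_composer_policy

# test the login; the returned token should carry gcp_composer_policy
vault write auth/approle/login \
    role_id=gcp_composer_role \
    secret_id=&lt;your_secret_id></pre>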



<p>Now we are all set in Vault. Let&#8217;s change the Airflow configuration to start using it.</p>



<h2 class="wp-block-heading">Configure Airflow (GCP Cloud Composer)</h2>



<p>Navigate to your Airflow instance and override the following two settings:</p>



<ul>
<li><strong>secrets.backend </strong>= <code>airflow.providers.hashicorp.secrets.vault.VaultBackend</code></li>



<li><strong>secrets.backend_kwargs </strong></li>
</ul>



<pre class="EnlighterJSRAW" data-enlighter-language="json" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{
"mount_point": "airflow", 
"connections_path": "dev-composer/connections" , 
"variables_path": null, 
"config_path": null, 
"url": "&lt;your_vault_url>", 
"auth_type": "approle", 
"role_id":"gcp_composer_role", 
"secret_id":"&lt;your_secret_id>"
}</pre>



<p><strong>connections_path</strong>: the path where you would like to store your Airflow connections. For me it&#8217;s the Composer environment name followed by <code>connections</code>. If you have a single Airflow instance, you could store everything directly under <code>connections</code>.<br><strong>variables_path</strong>: I have specified null because I am storing variables in Airflow itself. If you also want to store variables in Vault, just provide the path.<br><strong>config_path</strong>: same as variables; I am keeping config in Airflow.<br><strong>url</strong>: replace with your Vault URL.<br><strong>auth_type</strong>: we are using AppRole to authenticate with Vault, as discussed above.<br><strong>role_id</strong>: the role we created above. If you used a different name, replace it here.<br><strong>secret_id</strong>: the secret ID we generated for the role.</p>



<h2 class="wp-block-heading">How to store connections</h2>



<p>To store a connection, create a path named after the connection and put in a JSON document with the proper keys and values for that connection. For example, the default BigQuery connection would look like this:</p>



<p><strong>mount point:</strong>  airflow<br><strong>Path</strong>: dev-composer/connections/bigquery_default</p>



<pre class="EnlighterJSRAW" data-enlighter-language="json" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{
  "conn_type": "google_cloud_platform",
  "description": "",
  "extra": "{\"extra__google_cloud_platform__project\": \"your_project\", \"extra__google_cloud_platform__key_path\": \"\", \"extra__google_cloud_platform__key_secret_name\": \"\", \"extra__google_cloud_platform__keyfile_dict\": \"\", \"extra__google_cloud_platform__num_retries\": 5, \"extra__google_cloud_platform__scope\": \"\"}",
  "host": "",
  "login": "",
  "password": null,
  "port": null,
  "schema": ""
}</pre>
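

<p>If you prefer the CLI over the Vault UI, you could save the JSON above to a file and write it in one command (a sketch; the file name <code>bigquery_default.json</code> is an assumption):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="bash" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># write the connection document from a file into the kv v2 mount
vault kv put airflow/dev-composer/connections/bigquery_default @bigquery_default.json

# read it back to verify
vault kv get airflow/dev-composer/connections/bigquery_default</pre>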



<h2 class="wp-block-heading">How to store variables</h2>



<p>For variables, the key in the JSON must always be <code>value</code>, as shown below.</p>



<p><strong>mount point</strong>: airflow <br><strong>path</strong>: dev-composer/variables/raw_project_name</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">{
  "value": "raw_project_id"
}</pre>
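<p>Again via the CLI, the same variable could be written in one line (a sketch using the mount point and paths from the configuration above):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="bash" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">vault kv put airflow/dev-composer/variables/raw_project_name value=raw_project_id</pre>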
<p>The post <a href="https://bigdataproc.com/gcp-cloud-composer-configure-hashicorp-vault-to-store-connection-and-variables/">GCP Cloud Composer &#8211; Configure Hashicorp Vault to store connection and variables</a> appeared first on <a href="https://bigdataproc.com">Big Data Processing </a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://bigdataproc.com/gcp-cloud-composer-configure-hashicorp-vault-to-store-connection-and-variables/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
