<h1>Automatic PostgreSQL config with Ansible</h1>
<p>2016-08-13 · Dariusz Dwornikowski</p>
<p>If for some reason you can’t use a dedicated DBaaS for your PostgreSQL (like AWS RDS), then you need
to run your database server on a cloud instance. In this kind of setup, when you scale your
instance size up or down, you need to adjust PostgreSQL parameters according to the changing RAM size.
Several parameters in PostgreSQL depend heavily on the RAM size. An example is
<code>shared_buffers</code>, for which a rule of thumb says that it should be set to 0.25 * RAM.</p>
<p>In a DBaaS, when you scale the DB instance up or down, parameters are adjusted for you by the cloud
provider; e.g. AWS RDS uses parameter groups for this purpose, where particular parameters are
defined depending on the RAM size of the RDS instance.</p>
<p>So what can you do when you do not have RDS or any other DBaaS? You can always keep several
configuration files on your instance, one for each memory size; you can rewrite your config
every time you change the size of the instance… or you can use an Ansible role for that.</p>
<p>Our Ansible role will be very simple: it will have two tasks. One will change the PostgreSQL config,
and the second one will just restart the database server:</p>
<div class="highlight"><pre><span></span><span class="nn">---</span>
<span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">name</span><span class="p p-Indicator">:</span> <span class="l l-Scalar l-Scalar-Plain">Update PostgreSQL config</span>
  <span class="l l-Scalar l-Scalar-Plain">template</span><span class="p p-Indicator">:</span> <span class="l l-Scalar l-Scalar-Plain">src=postgresql.conf.j2 dest=/etc/postgresql/9.5/main/postgresql.conf</span>
  <span class="l l-Scalar l-Scalar-Plain">register</span><span class="p p-Indicator">:</span> <span class="l l-Scalar l-Scalar-Plain">pgconf</span>

<span class="p p-Indicator">-</span> <span class="l l-Scalar l-Scalar-Plain">name</span><span class="p p-Indicator">:</span> <span class="l l-Scalar l-Scalar-Plain">Restart postgresql</span>
  <span class="l l-Scalar l-Scalar-Plain">service</span><span class="p p-Indicator">:</span> <span class="l l-Scalar l-Scalar-Plain">name=postgresql state=restarted</span>
  <span class="l l-Scalar l-Scalar-Plain">when</span><span class="p p-Indicator">:</span> <span class="l l-Scalar l-Scalar-Plain">pgconf.changed</span>
</pre></div>
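<p>Equivalently, the restart could be expressed as a handler, which is the more common Ansible idiom for "restart only when the config changed". This is just a sketch, assuming the standard role layout with <code>tasks/main.yml</code> and <code>handlers/main.yml</code>:</p>

```yaml
# tasks/main.yml
- name: Update PostgreSQL config
  template:
    src: postgresql.conf.j2
    dest: /etc/postgresql/9.5/main/postgresql.conf
  notify: Restart postgresql

# handlers/main.yml
- name: Restart postgresql
  service:
    name: postgresql
    state: restarted
```

<p>With a handler, Ansible also coalesces multiple notifications into a single restart at the end of the play.</p>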
<p>Now we need the template, where the calculations take place. The RAM size will be taken from the
Ansible fact called <code>ansible_memtotal_mb</code>. Since it returns the RAM size in MB, we will stick to MB.
We will define the following parameters; you can adjust them to your needs:</p>
<ul>
<li><code>shared_buffers</code>, as 0.25*RAM size,</li>
<li><code>work_mem</code>, as <code>shared_buffers/max_connections</code>,</li>
<li><code>maintenance_work_mem</code>, as 64MB per GB of RAM,</li>
<li><code>effective_cache_size</code>, as 0.75*RAM size.</li>
</ul>
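<p>To make the arithmetic easy to check before encoding it as Jinja2 expressions, here is the same sizing math as a plain Python sketch. The 8 GB RAM size and the default <code>max_connections</code> of 100 are just example inputs:</p>

```python
# Sizing math from the list above, in plain Python for easy verification.
ansible_memtotal_mb = 8192   # example value of the Ansible fact (8 GB instance)
max_connections = 100        # role default, overridable at runtime

ram_gb = int(round(ansible_memtotal_mb / 1024.0))

shared_buffers_mb = int(ram_gb * 0.25) * 1024                     # 0.25 * RAM
work_mem_mb = int(round(ram_gb * 0.25 / max_connections * 1024))  # shared_buffers / max_connections
maintenance_work_mem_mb = ram_gb * 64                             # 64MB per GB of RAM
effective_cache_size_mb = int(ram_gb * 0.75) * 1024               # 0.75 * RAM

print(shared_buffers_mb)        # 2048
print(work_mem_mb)              # 20
print(maintenance_work_mem_mb)  # 512
print(effective_cache_size_mb)  # 6144
```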
<p>For <code>max_connections</code> we will define a default role variable of 100, but we will allow it to be
overridden at runtime. The relevant parts of <code>postgresql.conf.j2</code> are below:</p>
<div class="highlight"><pre><span></span><span class="x"> max_connections = </span><span class="cp">{{</span> <span class="nv">max_connections</span> <span class="cp">}}</span><span class="x"> </span>
<span class="x"> shared_buffers = </span><span class="cp">{{</span> <span class="o">(((</span><span class="nv">ansible_memtotal_mb</span><span class="o">/</span><span class="m">1024.0</span><span class="o">)|</span><span class="nf">round</span><span class="o">|</span><span class="nf">int</span><span class="o">)*</span><span class="m">0.25</span><span class="o">)|</span><span class="nf">int</span><span class="o">*</span><span class="m">1024</span> <span class="cp">}}</span><span class="x">MB</span>
<span class="x"> work_mem = </span><span class="cp">{{</span> <span class="o">((((</span><span class="nv">ansible_memtotal_mb</span><span class="o">/</span><span class="m">1024.0</span><span class="o">)|</span><span class="nf">round</span><span class="o">|</span><span class="nf">int</span><span class="o">)*</span><span class="m">0.25</span><span class="o">)/</span><span class="nv">max_connections</span><span class="o">*</span><span class="m">1024</span><span class="o">)|</span><span class="nf">round</span><span class="o">|</span><span class="nf">int</span> <span class="cp">}}</span><span class="x">MB</span>
<span class="x"> maintenance_work_mem = </span><span class="cp">{{</span> <span class="o">((</span><span class="nv">ansible_memtotal_mb</span><span class="o">/</span><span class="m">1024.0</span><span class="o">)|</span><span class="nf">round</span><span class="o">|</span><span class="nf">int</span><span class="o">)*</span><span class="m">64</span> <span class="cp">}}</span><span class="x">MB</span>
<span class="x"> effective_cache_size = </span><span class="cp">{{</span> <span class="o">(((</span><span class="nv">ansible_memtotal_mb</span><span class="o">/</span><span class="m">1024.0</span><span class="o">)|</span><span class="nf">round</span><span class="o">|</span><span class="nf">int</span><span class="o">)*</span><span class="m">0.75</span><span class="o">)|</span><span class="nf">int</span><span class="o">*</span><span class="m">1024</span> <span class="cp">}}</span><span class="x">MB</span>
</pre></div>
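<p>For a hypothetical instance where <code>ansible_memtotal_mb</code> is exactly 8192 and <code>max_connections</code> is left at the default of 100, the template above would render to:</p>

```
max_connections = 100
shared_buffers = 2048MB
work_mem = 20MB
maintenance_work_mem = 512MB
effective_cache_size = 6144MB
```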
<p>You can now run the role every time you change the instance size, and the config will be adjusted
according to the RAM size. You can extend the role, maybe add other constraints, and change
<code>max_connections</code> to your specific needs. An example playbook could look like this:</p>
<div class="highlight"><pre><span></span>---
- hosts: my_postgres
  roles:
    - postgres-config
  vars:
    max_connections: 300
</pre></div>
<p>And run it:</p>
<div class="highlight"><pre><span></span>$ ansible-playbook playbook.yml
</pre></div>
<p>The complete role can be found in my <a href="https://github.com/tdi/postgres-config">github repo</a>.</p>
<h1>HAProxy and 503 HTTP errors with AWS ELB as a backend</h1>
<p>2016-04-19 · Dariusz Dwornikowski</p>
<p>Although AWS provides a load balancer service in the form of Elastic Load Balancer (ELB), a common
trick is to use HAProxy in the middle to provide SSL offloading, complex routing and better logging.
In this scenario, a public ELB is the frontier of all the traffic, a HAProxy farm in the middle is
managed by an Auto Scaling Group, and one (or more) internal backend ELBs stay in front of the Web farm.</p>
<p><img alt="haproxy" src="//tdi.github.io/images/haproxy.png" /></p>
<p>I think that <a href="http://www.haproxy.org/">HAProxy</a> does not need any introduction here. It is a highly
scalable and reliable piece of software. There is, however, a small caveat when you use it with domain
names and not IP addresses. To speed things up, HAProxy resolves all the domain names during startup
(during config file parsing, in fact). Hence, when the IP behind a domain changes, you end up with a lot
of 503s (Service Unavailable).</p>
<p>Why is this important? In AWS, an ELB's IP can change over time, so it is recommended to use the ELB's
domain name. Now, when you use this domain name in a HAProxy backend, you can end up with 503s. ELB IPs
do not change that often, but still you would not want any downtime.</p>
<p>The solution is to configure runtime resolvers in HAProxy and use them in the backend
<a href="http://blog.haproxy.com/2015/10/14/whats-new-in-haproxy-1-6/">(unfortunately, this works only in HAProxy 1.6)</a>:</p>
<div class="highlight"><pre><span></span>resolvers myresolver
    nameserver dns1 10.10.10.10:53
    resolve_retries 30
    timeout retry 1s
    hold valid 10s

backend mybackend
    server myelb myelb-internal.123456.eu-west-1.elb.amazonaws.com:80 check resolvers myresolver
</pre></div>
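<p>To see why the runtime resolvers matter, here is a toy Python sketch (not HAProxy internals, just an illustration with a made-up hostname): a name resolved once at startup keeps pointing at a stale address after the record changes, while a name resolved at runtime picks up the new one.</p>

```python
# Fake DNS zone standing in for Route 53; the hostname is hypothetical.
dns = {"myelb.example.com": "10.0.0.1"}

# HAProxy before 1.6: the name is resolved once, while parsing the config.
startup_ip = dns["myelb.example.com"]

# AWS later replaces the ELB node, so the record now points elsewhere.
dns["myelb.example.com"] = "10.0.0.2"

# HAProxy 1.6 resolvers: the name is re-resolved at runtime.
runtime_ip = dns["myelb.example.com"]

print(startup_ip)  # 10.0.0.1 (stale: connections here now fail, hence the 503s)
print(runtime_ip)  # 10.0.0.2 (the current address)
```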
<p>Now HAProxy will re-resolve the domain at runtime, and the 503s are gone.</p>