Mon blog-notes à moi que j'ai

Personal blog of a sysadmin with hacker leanings

Twitter & RSS watch roundup #2016-36

The harvest of links for the week of September 5 to 9, 2016. Most of them were posted on my Twitter account. Here they are, gathered for those who may have missed them.

Happy reading

Security & Privacy

Considerations on DMZ Design in 2016, Part 2: A Quick Digression on Reverse Proxies
This is the second part of a series with considerations on DMZ networks in 2016 (part 1 can be found here). Beforehand I had planned to cover classification & segmentation approaches in this one, but after my little rant on how « the business » might approach & think about reverse proxies in the first part, I felt tempted to elaborate a bit further on this particular topic. I kindly ask for your patience 😉 and will digress a bit for the moment.
Keeping Android safe: Security enhancements in Nougat
Over the course of the summer, we previewed a variety of security enhancements in Android 7.0 Nougat: an increased focus on security with our vulnerability rewards program, a new Direct Boot mode, re-architected mediaserver and hardened media stack, apps that are protected from accidental regressions to cleartext traffic, an update to the way Android handles trusted certificate authorities, strict enforcement of verified boot with error correction, and updates to the Linux kernel to reduce the attack surface and increase memory protection. Phew!
Candy from Strangers
A few days ago I gave a talk at ESC about some reasons why I think that using software, and especially libraries, from the packages of a community-managed distribution is important and much better than alternatives such as PyPI, npm, etc. This article is a translation of what I planned to say before forgetting bits of it and luckily adding it back as an answer to a question :)

System Engineering

Using nginScript to Progressively Transition Clients to a New Server
With the launch of NGINX Plus R10, we introduced a preview release of our next‑generation programmatic configuration language, nginScript. nginScript is a unique JavaScript implementation for NGINX and NGINX Plus, designed specifically for server‑side use cases and per‑request processing.
Wait… is that how you are supposed to configure your SSD card?
I bought a laptop with only SSD drives a while ago and, based on a limited amount of reading, added the « discard » option to my /etc/fstab file for all partitions and happily went on my way, expecting to avoid the performance degradation problems that happen on SSD cards without this setting.
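To see which partitions actually carry that option, one could scan the fstab entries. The following is a minimal sketch, assuming the usual whitespace-separated fstab layout (device, mount point, type, options, dump, pass); the `entries_with_option` helper and the sample entries are hypothetical, not taken from the article.

```python
def entries_with_option(fstab_text, option="discard"):
    """Return the mount points whose options field contains `option`."""
    matches = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # A usable entry needs at least: device, mount point, type, options.
        if len(fields) < 4:
            continue
        if option in fields[3].split(","):
            matches.append(fields[1])
    return matches


# Hypothetical sample content; a real check would read /etc/fstab instead.
SAMPLE_FSTAB = """\
# <device> <mount point> <type> <options> <dump> <pass>
UUID=aaaa / ext4 defaults,discard 0 1
UUID=bbbb /home ext4 defaults 0 2
"""

print(entries_with_option(SAMPLE_FSTAB))  # ['/']
```

On a real machine, passing the contents of /etc/fstab to the same function would list the mount points configured with `discard`.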
Deploying to Multiple Kubernetes Clusters with kit
Our Docker journey at InVision may sound familiar. We started with Docker in our development environments, trying to get consistency there first. We wrangled our legacy monolith application into Docker images and streamlined our Dockerfiles to minimize size and amp the efficiency. Things were looking good. Did we learn a lot along the way? For sure. But at the end of it all, we had our entire engineering team working with Docker locally for their development environments. Mission accomplished! Well, not quite. Development was one thing, but moving to production was a whole other ballgame.
How to write your first Lintian check
Lintian’s humble description of « Debian package checker » belies its importance within the Debian GNU/Linux project. An extensive static analysis tool, it is not only used by the vast majority of developers; falling foul of some of its checks even causes uploads to be automatically rejected by the archive maintenance software.
Testing build reproducibility with debrepro
Earlier today I was handling a reproducibility bug and decided I had to try a reproducibility test by myself. I tried reprotest, but I was being hit by a disorderfs issue and I was not sure whether the problem was with reprotest or not (at this point I cannot reproduce that anymore).


How We Monitor Elasticsearch with Graphite and Grafana
« How do you monitor Elasticsearch at this scale? » is a question we are asked again and again by ELK Stack users and our customers. Recognizing the challenge, we wanted to share some of our monitoring engineering with everyone!
Dominos, Botnets, and a little LSTM
Suppose you were to watch a stream of numbers… and given the previous number you saw you had to predict what the next number should be. For example, suppose you saw 1,2,1,2,… We might guess: 1. Or perhaps, what if you saw 0,0,1,2,3…? Should it be 4? It almost feels like a domino effect. In this post we walk through predicting a specific type of spike in domain queries associated with some botnets. These domains spike like falling dominos: 0,0,0,1,2,3….
Dockbeat: A new addition to the Beats Community
Did you ever want to know how your Docker containers behave over time? Did you ever want to have a Beat capable of reading Docker containers statistics and indexing them into Elasticsearch? Today, the solution is at your fingertips: Dockbeat!

Software Engineering

Improving the performance of full-text search
For Firefly, Dropbox’s full-text search engine, speed has always been a priority. (For more background on Firefly, check out our blog post). When our team saw search latency deteriorate from 250 ms to 1000 ms (95th percentile), we knew what to do—we measured, we analyzed, we fixed.
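To make the "95th percentile" figure concrete, here is a minimal sketch of a nearest-rank percentile over a list of latencies. The `percentile` helper and the sample data are illustrative assumptions, not the measurement code used for Firefly.

```python
def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, to avoid floating-point surprises.
    rank = max(1, -(-p * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 50, 100, ..., 1000 (20 samples).
latencies_ms = list(range(50, 1001, 50))

print(percentile(latencies_ms, 95))  # 950
```

A p95 of 950 ms here means 95% of the sampled requests completed in 950 ms or less, which is why a few slow outliers can dominate this metric even when the median looks healthy.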
RESTful API Versioning Insights
When it comes to API versioning, there are many best practices and insights, but there is still no rock-solid best practice.
In order to understand RESTful API versioning, we first need to understand the problem.
Building resilience in Spokes
Before we get into the topic at hand—building resilience—we have a new name to announce: DGit is now Spokes.
Earlier this year, we announced « DGit » or « Distributed Git, » our application-level replication system for Git. We got feedback that the name « DGit » wasn’t very distinct and could cause confusion with the Git project itself. So we have decided to rename the system Spokes.
API First Transformation at Etsy – Concurrency
At Etsy we have been doing some pioneering work with our Web APIs. We switched to API-first design, have experimented with concurrency handling in our composition layer, introduced strong typing into our API design, experimented with code generation, and built distributed tracing tools for API as part of this project.
What are the scalable technologies Facebook uses?
Facebook is the world’s second largest website, with billions of users around the globe. In one month, it serves more than 570 billion page views and 3 billion photo uploads. In one second, Facebook serves more than 1.2 million photos, and that doesn’t include the images served by Facebook’s CDN. In addition, more than 25 billion pieces of content (status updates, comments, etc.) are shared every month on Facebook. Facebook has more than 35,000 servers to serve all these activities.


PHP 7.0: new features from the not-so-distant past
I will publish other posts about PHP 7.1 next week but, to conclude this one, here’s a quick flashback listing some nice features PHP 7.0 brought us last year.
After all, these are available in PHP 7.1 too, and, as many are still using PHP 5, they constitute additional reasons for switching to PHP 7.x;-)
PHP 7.1: changes to types
One of the most important changes PHP 7.0 brought us last year was about typing, with the introduction of scalar type-declarations for functions/methods parameters and their return value.
PHP 7.1 adds to those type-declarations, with several points that were missing in the previous version of the language.
PHP 7.1: enhancements to errors handling
PHP 7.1 brings a couple of enhancements to errors and exceptions handling.
PHP 7.1: testing a pre-release
Knowing PHP 7.1 will bring us some nice new features is great. But being able to play with those by ourselves is much better!
Like with any PHP version, you can proceed several ways. You can compile PHP yourself, look for a docker image built by a community member, or trust a maintainer who could already have packaged PHP 7.1 for your distro.
Using Elasticsearch with PHP: a complete guide
Elasticsearch is an open-source full-text search engine which allows you to store and search data in real time. You can search for phrases as well and it will give you the results within seconds depending on how large the Elasticsearch database is.
PHP 7.1: introduction and release cycle
A new minor version of PHP is just around the corner: PHP 7.1!
Its release date is not really set yet, as it depends on the number of bugs that will be reported and fixed on Release Candidates, but it should happen before the end of this year.


Laravel and Behat Using Selenium and Headless Chrome
Let’s take a look at using Codeship for Selenium and Headless Chrome testing, which is key for interacting with JavaScript features on your site. I also want to show you how to troubleshoot those rare moments when there’s an issue on the CI but not on your local build, by using Codeship’s SSH feature and Sauce Lab’s remote connections. You can see all the code here.

Web Performance

Web Performance 101: Tuning Tips for Images
Visual content is an integral part of modern websites; even the most minimalist website designs rely on sharp, clear images to perfect the end user experience. The variety of screen sizes available today (from smartphones and tablets to laptops and desktops) makes it important to use high quality images that render properly on any viewport. Blurry or low resolution images can make the website design look shabby, which can ultimately have a negative impact on the end user experience.

Database Engineering

MySQL & MariaDB

Basic Housekeeping for MySQL Indexes
In this blog post, we’ll look at some of the basic housekeeping steps for MySQL indexes.
We all know that indexes can be the difference between a high-performance database and a bad/slow/painful query ride. It’s a critical part that deserves some housekeeping once in a while. So, what should you check?
From decimal to timestamp with MySQL
When working with timestamps, one question that often arises is the precision of those timestamps. Most software is good enough with a precision up to the second, and that’s easy. But in some cases, like working on metering, a finer precision is required.
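As an illustration of sub-second precision, here is a minimal sketch that converts a decimal number of seconds since the Unix epoch into a timestamp with microseconds. It is written in Python rather than SQL, and the `decimal_to_datetime` helper and the sample value are assumptions for illustration, not the approach described in the linked post.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decimal_to_datetime(seconds: Decimal) -> datetime:
    """Convert decimal epoch seconds to a UTC datetime, keeping microseconds."""
    # Work in whole microseconds so no precision is lost to binary floats.
    micros = int((seconds * 1_000_000).to_integral_value())
    return EPOCH + timedelta(microseconds=micros)


# Hypothetical metering timestamp with microsecond precision.
stamp = decimal_to_datetime(Decimal("1473379200.123456"))
print(stamp.isoformat())  # 2016-09-09T00:00:00.123456+00:00
```

Using `Decimal` instead of `float` matters here: a double cannot represent every microsecond-level value at this magnitude exactly, so the fractional part could drift.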

Data Engineering & Analytics

The Probability Monad and Why it’s Important for Data Science
Very often one builds a statistical model in pieces. For example, imagine one has a binary event which may or may not occur - to work with my thematic example, a visitor arrives on a webpage and he may or may not convert. A reasonable question to ask is « if I have 100 visitors, how many of them can I expect to convert? » Assume now that I know the conversion rate lmbda; in this case the maximum likelihood point estimate for the number of conversions is 100*lmbda and the probability distribution of possible events that could occur is binom(100, lmbda) (i.e. a binomial distribution). But what happens if lmbda is not known, but instead a random variable?
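The question at the end of that excerpt can be sketched with a tiny discrete probability monad. This is a minimal illustration, assuming distributions are represented as dictionaries mapping outcomes to probabilities; the `unit`, `bind`, `binom` and `expectation` helpers and the two-point prior on lmbda are hypothetical choices, not the article's implementation.

```python
from math import comb


def unit(x):
    """Distribution that yields x with certainty."""
    return {x: 1.0}


def bind(dist, f):
    """Chain a distribution with a function returning a distribution (monadic bind)."""
    out = {}
    for x, px in dist.items():
        for y, py in f(x).items():
            out[y] = out.get(y, 0.0) + px * py
    return out


def binom(n, p):
    """Binomial distribution over the number of successes in n trials."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def expectation(dist):
    return sum(x * p for x, p in dist.items())


# Known conversion rate: the distribution is simply binom(100, lmbda).
known = bind(unit(0.1), lambda lmbda: binom(100, lmbda))

# Unknown conversion rate: lmbda is itself random (hypothetical two-point prior).
prior = {0.1: 0.5, 0.3: 0.5}
conversions = bind(prior, lambda lmbda: binom(100, lmbda))

# Expected conversions, close to 100 * (0.5 * 0.1 + 0.5 * 0.3) = 20.
print(round(expectation(conversions), 6))
```

The point of `bind` is that the same composition works whether lmbda is a fixed number or a distribution: the model is still built in pieces, and the uncertainty about lmbda is carried through automatically.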

Network Engineering

Using RIPE Atlas to Measure Cloud Connectivity
Internet connectivity is an important component of cloud performance. Unlike traditional data center and colocation, cloud users have little say in uplinks, peering and other connectivity related decisions. Regardless of how performant cloud systems may be, user performance will suffer if connectivity is poor.

Management & Organization

How to validate that your DevOps process is working
Within the evolution of software development, DevOps is an advanced innovation that couples with the agile development process for the lean, streamlined, and smooth deployment of software applications. Use of DevOps expands agile project development into inter-departmental operating procedures for a fundamental enhancement to the efficiency in software delivery. The goal of the DevOps blueprint is lean production throughout the entire deployment process, from planning to delivery.
As DevOps is organically both a culture and a method, definitive points on which to validate outcomes are sometimes elusive and thereby difficult to specify. The cultural aspect of transition to DevOps, as well as its collaborative flexibility, also allow for indistinct validation. Business leaders must uniquely track DevOps processes to assess cost to market and ROI. Accordingly, agile managers and business stakeholders must look towards differential insights to assess progress.
Hypothesis driven analysis
Often, in very different situations, we find ourselves having to carry out analyses. Whether it is a technical analysis of which algorithm to choose, an ergonomic analysis of a web design, or something else, there are two main ways to conduct these studies.
Learning to reclaim your time
Are you drowning in to-do lists and a thousand and one tasks, each more urgent than the last?
Sad that you never find the time to step back from your activities?
Frustrated at having become unable to concentrate for more than 10 minutes?
What if it were time to reclaim your time?
Less time, and yet…
Innovation: the right cast at the right time
How many times have we heard this phrase uttered by a manager, a senior executive or, better still, a CEO? But here is the thing: being innovative cannot be decreed, and not everyone is capable of it. Innovation is a mantra, repeated by many, that loses its meaning over time and fuels a great deal of fantasy. Yet those most directly involved know well that working in innovation is often thankless and that glory is uncertain. To get a result, you have to be willing to put your work back on the bench twenty times, dare a lot, fail often, and sometimes end up with only very small results.