Mon blog-notes à moi que j'ai

Personal blog of a sysadmin with a hacker streak

Twitter & RSS watch roundup #2016-25

The harvest of links for the week of June 20 to 24, 2016. Most of them were published on my Twitter account. Here they are, gathered for those who missed them.

Happy reading

Security & Privacy

How to Apply DevOps Culture to Security & Why You Should Do It
Unless you’ve been living under a rock (or don’t work in the tech industry), you’ve probably heard the term DevOps thrown around. A mashup of « development » and « operations », DevOps is a mindset and set of practices that focus on collaboration and communication between software developers and other IT professionals with the goal of automating both software delivery and infrastructure changes.
Progress Towards 100% HTTPS, June 2016
Our goal with Let’s Encrypt is to get the Web to 100% HTTPS. We’d like to give a quick progress update.
Let’s Encrypt has issued more than 5 million certificates in total since we launched to the general public on December 3, 2015. Approximately 3.8 million of those are active, meaning unexpired and unrevoked. Our active certificates cover more than 7 million unique domains.
Granting Temporary Access to Your Servers (Using Signed SSH Keys)
In need of support from a colleague or vendor, but don’t want to give them permanent access? SSH has an option to allow temporary access! Next time you need to provide temporary access for an hour or day, use this great option.
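As a rough sketch of the idea in this excerpt, a short-lived SSH certificate can be issued by signing a public key with a CA key and a validity interval. The helper below only builds the `ssh-keygen` command line and does not run it. The file names `ca_key` and `id_rsa.pub`, the identity label, and the principal name are placeholders chosen for illustration, not values from the linked article.

```python
def build_signing_command(ca_key, public_key, identity, principal, validity="+1h"):
    """Build an ssh-keygen command that signs public_key with a CA key.

    -s: path to the CA private key
    -I: certificate identity (a label that shows up in server logs)
    -n: principal (login name) the certificate is valid for
    -V: validity interval; "+1h" means valid from now for one hour
    """
    return [
        "ssh-keygen",
        "-s", ca_key,
        "-I", identity,
        "-n", principal,
        "-V", validity,
        public_key,
    ]


if __name__ == "__main__":
    # Hypothetical example: one day of access for a vendor, logging in as "deploy".
    cmd = build_signing_command("ca_key", "id_rsa.pub", "vendor-support", "deploy", "+1d")
    print(" ".join(cmd))
```

For the certificate to be accepted, the server has to trust the CA public key, typically through the `TrustedUserCAKeys` directive in `sshd_config`.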
Cyberattaque : quelle est la meilleure défense?
For three years, the ADAX project has been trying to determine the best countermeasures to adopt in real time in the event of a cyberattack. Last April, it received the ITEA Business Impact award in recognition of the results obtained by its researchers, including those from Télécom SudParis and Télécom Bretagne.
Le gâchis
At last! After 4 years of work and €5M in public funding, the « sovereign » antivirus project has just been published on GitHub.
Initially known as DAVFI (Démonstrateurs d’AntiVirus Français et Internationaux), then as Uhuru Anti Malware (its commercial brand), the open source project is now called Armadito. Is that a nod to the well-known but now obsolete software protection « Armadillo »?
The DAO, Ethereum, et l’attaque de juin 2016
On Friday, June 17, 2016, a spectacular attack against the organization The DAO took place, resulting in the theft of about a third of its funds. What lessons can be drawn from it?

System Engineering

Economical With The Truth: Making DNSSEC Answers Cheap
We launched DNSSEC late last year and are already signing 56.9 billion DNS record sets per day. At this scale, we care a great deal about compute cost. One of the ways we save CPU cycles is our unique implementation of negative answers in DNSSEC.
Ssh from the ground up
If you work professionally in the IT industry, chances are you’ve been using OpenSSH for a long time now for your day to day work.
OpenSSH, however, provides so much more than « just » a remote shell on *nix systems (and apparently on Windows too now!), and in this article we’re going to explore some of the less obvious uses of ssh and introduce a few accessory tools that make using ssh even better.
Run CoreOS on FreeBSD’s bhyve
No, I’m not following the hype, only I like to test things plus I feel there will be a growing demand for docker at ${DAYWORK}. I read here and there that CoreOS was the Linux distribution of choice to play with docker, so while at it, I picked up this one to dive into the container worlds.
Netflix Billing Migration to AWS
On January 4, 2016, right before Netflix expanded itself into 130 new countries, Netflix Billing infrastructure became 100% AWS cloud-native. Migration of Billing infrastructure from Netflix Data Center (DC) to AWS Cloud was part of a broader initiative. This prior blog post is a great read that summarizes our strategic goals and direction towards AWS migration.
Running Services in Docker 1.12
Great news! Docker for Windows and Mac is now in public beta, which means that Docker is that much easier to use for local development regardless of your preferred environment. You can download your preferred flavor of Docker at docker.com/getdocker.
Starting today, Docker for Mac and Windows also ships with Docker v1.12-rc2. That’s one step closer to making Docker a really powerful tool for orchestrating your application.
What’s New in Docker: Swarm Mode, Built in Orchestration, Services, Healthchecks, .dab files, constraints
To kick off the first day of DockerCon in Seattle, Mike Goelzer and Andrea Luzzardi spoke about what’s new with Docker in 2016. Goelzer is the open source product management lead for Docker’s Core Runtime, and Luzzardi is a Software Engineer at Docker and was part of the original team that built the project. The biggest announcement definitely was the release of Docker 1.12.

Monitoring

Monitor Postfix queue performance
Postfix is an open source, SMTP-based mail transfer agent (MTA) first released in 1997 that continues to be a popular choice for organizations that need an affordable way to route and deliver email. By using Postfix, organizations also have an initial layer of protection against common email issues like spambots and malware.
Datadog integrates with Postfix to ensure you have all the necessary information to monitor your email service’s performance.
Container Design Patterns
Kubernetes automates deployment, operations, and scaling of applications, but our goals in the Kubernetes project extend beyond system management – we want Kubernetes to help developers, too. Kubernetes should make it easy for them to write the distributed applications and services that run in cloud and datacenter environments. To enable this, Kubernetes defines not only an API for administrators to perform management actions, but also an API for containerized applications to interact with the management platform.
Announcing enterprise-grade enterprise monitoring software
TL;DR: You asked, we delivered. Now you can get Sysdig container monitoring technology as software that runs in your private cloud or virtual private cloud, in addition to our existing cloud service.
Service Discovery: monitoring Docker containers that move
Docker is being adopted rapidly, and for good reason: it simplifies many aspects of running a service in production. But Docker-powered services typically run many more containers than traditional services run hosts, so monitoring is much more complex. Now with platforms like Kubernetes and ECS orchestrating your containers, you may not even know which host your containers are running on—this makes monitoring your services even more complex.
package monkit
Package monkit is a flexible code instrumenting and data collection library.

Software Engineering

Powering Blue-Green Deployments with Feature Flags
Blue-green deployments have long been a proven technique to mitigate risk in software releases. By adding feature flags, developers are ushering in a new era of blue-green deployments, one with unprecedented granular control over feature releases. This article discusses how to effectively integrate feature flags into your blue-green deployment process.
At its core, a blue-green deployment is a release practice that maintains two production environments called blue and green, switching between whether the blue or green environment is live. The primary benefit of this approach is to mitigate risk and control the timing of releases. The blue version might have the new version and green the old version. If something goes wrong, you can switch back to the more stable environment.
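The switching logic described in this excerpt can be sketched in a few lines. This is a minimal illustration, not the approach from the linked article: the class name `BlueGreenRouter`, its methods, and the flag name in the usage example are all hypothetical, and a real setup would flip a load balancer or DNS record rather than an in-memory attribute.

```python
class BlueGreenRouter:
    """Tracks which of two environments is live, plus a set of feature flags."""

    ENVIRONMENTS = ("blue", "green")

    def __init__(self, live="blue"):
        if live not in self.ENVIRONMENTS:
            raise ValueError(f"unknown environment: {live!r}")
        self.live = live
        self.flags = {}

    @property
    def idle(self):
        """The environment that is not currently serving traffic."""
        return "green" if self.live == "blue" else "blue"

    def switch(self):
        """Promote the idle environment to live and return its name.

        Calling switch() a second time is the rollback path.
        """
        self.live = self.idle
        return self.live

    def set_flag(self, name, enabled):
        """Turn a feature on or off without redeploying either environment."""
        self.flags[name] = bool(enabled)

    def is_enabled(self, name):
        """Unknown flags default to off."""
        return self.flags.get(name, False)


if __name__ == "__main__":
    router = BlueGreenRouter(live="green")
    router.switch()                        # blue (new version) goes live
    router.set_flag("new-checkout", True)  # expose the feature gradually
    router.switch()                        # something went wrong: back to green
    print(router.live)
```

Keeping flags separate from the environment switch is the point of the combination: the release (which environment is live) and the exposure of a feature (which flags are on) can be rolled back independently.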
Uber Engineering’s Micro Deploy: Deploying Daily with Confidence
In 2014, Uber began expanding ever-rapidly. Our platform grew from about 60 cities to 100 in the spring, and then to 200 in the fall. Meanwhile, our fastest growing cities were among our oldest.
As the number of additional platform engineers grew, so did the disorganization of deploying new code. Each team used custom shell scripts and then manually monitored them with tools specific to each service as new versions of its microservices were shepherded into production. When upgrading hosts went awry, it required tedious manual rollback one machine at a time. With more and more engineers working on Uber services this manual labor began wasting time and sometimes prolonged outages in our services.
Commit messages are not titles
Nor subjects, for that matter. Everybody will tell you not to add a dot at the end of the first line of a commit message. I followed the advice for some time, but I’ll stop today, because I don’t believe commit messages are titles or subjects. They are synopses of the meaning of the change operated by the commit, so they are small sentences. The sentence can later be augmented with more details in the next lines of the commit message; however, many times there is no body, there is just the first line. How many emails or articles do you see with just the subject or the title? Very few, I guess. So for me it is like:
Retour d’expérience : réaliser des Workers en PHP - Fabien de Saint pern au PHP Tour 2016
Fabien de Saint pern, lead dev of our 6play back-end team, was at the PHP Tour and gave a talk on how we build workers in PHP.
Otto: The Next Generation of Vagrant
Not so long ago, Vagrant was the prime tool that attempted to solve that time-immemorial problem of « it works on my machine. » Developers could create shareable Vagrant files to allow coworkers to spin up replica machines for testing code and the interconnecting parts of a typical modern project. Vagrant is far from dead, but it suffers from a couple of long-lasting issues, including the resource footprint of virtual machines created, the speed of sharing files between the host and virtual machine, and the speed of making configuration changes to virtual machines.
Introducing Experimental Distributed Application Bundles
The built-in orchestration features announced today with Docker 1.12 will revolutionize how IT teams build, ship and run containerized apps. With Docker 1.12, developers and ops now share a set of simple and powerful APIs, tools, and formats for building agile delivery pipelines that ship software from development through CI to production in the cloud with Docker for AWS and Azure.

Database Engineering

Elasticsearch

Running Elasticsearch on AWS
We often talk to customers running Elasticsearch clusters on Amazon Web Services (AWS). AWS is a convenient way to provision and scale machine resources in response to changing business requirements. Elasticsearch takes advantage of EC2’s on-demand machine architecture enabling the addition and removal of EC2 instances and corresponding Elasticsearch nodes as capacity and performance requirements change.
In this article we will show you how to deploy Elasticsearch 2.3.3 on Amazon EC2. In this example we will configure a three node Elasticsearch cluster.
Just Enough Kafka For The Elastic Stack, Part 2
Welcome to part 2 of our multi-part Apache Kafka and Elastic Stack post. In our previous post, we introduced use cases of Kafka for the Elastic Stack and shared knowledge about designing your system for time based and user based data flow. In this post, we’ll focus on the operation aspects: tips for running Kafka and Logstash in production to ingest massive amounts of data.
Finding a Scalable Data Model for Search @ bol.com
Bol.com started in 1999 and has grown from an online bookstore to an online superstore with a wide variety of products, including books, segways, shoes, saunas, swimming pools, and much more. Since 2010, Bol.com has also opted to become a platform for other sellers to (re)sell their products and/or sell the same products bol.com offers but with different conditions.
Right now we have over 11 million products available, 6.2 million active customers, and about 230,000 active sellers a month.
Self-Ranking Search with Elasticsearch at Wattpad
At Wattpad search is used millions of times a day by people looking to discover stories they want to read. We use Elasticsearch to power these searches over tens of millions of documents. One of the techniques we use allows the search system to re-rank documents over time to better suit the needs of people using the system. Let’s start at the beginning and see why this technique works and a simple implementation of how it can be used in your search system.
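To give a feel for what re-ranking over time can look like, here is a minimal sketch that boosts a document's relevance score by how often it has been clicked. This is an assumption made for illustration and not the technique from the linked article: the function name `rerank`, the click-count signal, and the logarithmic boost are all hypothetical choices.

```python
import math


def rerank(results, clicks):
    """Re-order search results using accumulated click counts.

    results: list of (doc_id, base_score) pairs from the search engine.
    clicks:  dict mapping doc_id to the number of recorded clicks.
    Returns doc ids sorted by base_score * (1 + log(1 + clicks)), best first.
    The logarithm keeps very popular documents from drowning out relevance.
    """
    def boosted(item):
        doc_id, base_score = item
        return base_score * (1.0 + math.log1p(clicks.get(doc_id, 0)))

    return [doc_id for doc_id, _ in sorted(results, key=boosted, reverse=True)]


if __name__ == "__main__":
    results = [("story-a", 1.0), ("story-b", 0.9)]
    print(rerank(results, {}))               # no feedback yet: base order
    print(rerank(results, {"story-b": 10}))  # clicks lift story-b to the top
```

As click counts accumulate, the ordering shifts without anyone editing the query or the index, which is the "self-ranking" effect the excerpt alludes to.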
Field notes - ElasticSearch at petabyte scale on AWS
I manage a somewhat sizable fleet of ElasticSearch clusters. How large? Well, « large » is relative these days. Strictly in ElasticSearch data nodes, it’s currently operating on the order of: several petabytes of provisioned data-node storage, thousands of Xeon E5 v3 cores, tens of terabytes of memory, and many billions of events indexed a day (24/7/365).
And growing. Individual clusters tend to range anywhere from 48TB to over a petabyte.

Vertica

Counting Triangles
Recently I’ve heard from or read about people who use Hadoop because their analytic jobs can’t achieve the same level of performance in a database. In one case, a professor I visited said his group uses Hadoop to count triangles « because a database doesn’t perform the necessary joins efficiently. »
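For readers unfamiliar with the problem, counting triangles in a graph can be expressed as a three-way self-join on an edge table. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in; it is not the database or the query from the linked article, and the table and column names are made up for the example.

```python
import sqlite3


def count_triangles(edges):
    """Count triangles in an undirected graph with a three-way self-join.

    edges: iterable of (a, b) node-id pairs. Each edge is stored once with
    src < dst, so a triangle a < b < c matches exactly one combination of
    rows: (a, b), (b, c) and (a, c).
    """
    normalized = {(min(a, b), max(a, b)) for a, b in edges if a != b}
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER)")
        conn.executemany("INSERT INTO edges VALUES (?, ?)", sorted(normalized))
        (count,) = conn.execute(
            """
            SELECT COUNT(*)
            FROM edges e1
            JOIN edges e2 ON e2.src = e1.dst
            JOIN edges e3 ON e3.src = e1.src AND e3.dst = e2.dst
            """
        ).fetchone()
        return count
    finally:
        conn.close()


if __name__ == "__main__":
    # One triangle (1, 2, 3) plus a dangling edge to node 4.
    print(count_triangles([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

Storing each edge in a single canonical direction avoids counting the same triangle several times, at the cost of a normalization pass before loading.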
Analyze Mismatched Series with Event Series Joins
Event series occur in tables with a time column, most typically a TIMESTAMP data type. In Vertica, you perform an event series join to analyze two series in different tables when their measurement intervals don’t align, such as with mismatched timestamps.
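The idea of joining two series whose timestamps do not line up can be sketched in plain Python: for each row of one series, pick the most recent value of the other series at or before that row's timestamp. This is only an illustration of the concept under that assumption; it is not Vertica syntax, and the function name `asof_join` is hypothetical.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Join two time series with mismatched timestamps.

    left, right: lists of (timestamp, value) pairs, each sorted by timestamp.
    For every left row, attach the latest right value whose timestamp is at
    or before the left timestamp, or None when no such value exists yet.
    """
    right_times = [t for t, _ in right]
    joined = []
    for t, value in left:
        # Index of the first right timestamp strictly greater than t.
        i = bisect_right(right_times, t)
        previous = right[i - 1][1] if i > 0 else None
        joined.append((t, value, previous))
    return joined


if __name__ == "__main__":
    bids = [(1, "bid-a"), (5, "bid-b"), (10, "bid-c")]
    asks = [(2, "ask-x"), (5, "ask-y"), (7, "ask-z")]
    for row in asof_join(bids, asks):
        print(row)
```

The binary search keeps the join at O(n log m) for sorted inputs; a database can do the same work in a single merge pass over both series.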

MySQL & MariaDB

Why is varchar(255) not varchar(255)?
Recently I was working on a client’s question and stumbled over an issue with replication and mixed character sets. The client asked whether it is possible to replicate data to a table on a MySQL slave where one column had a different character set than the column in the same table on the master.
Docker automatic MySQL slave propagation
In this post, we’ll discuss Docker automatic MySQL slave propagation for help with scaling.
In my previous posts on the Docker environment, I covered Percona XtraDB Cluster. Percona XtraDB Cluster can automatically scale by conveniently adding new nodes using the highly automated State Snapshot Transfer. State Snapshot Transfer allows a new node to copy data from an existing node (I still want to see how this is possible with MySQL Group Replication).
Running Percona XtraDB Cluster nodes with Linux Network namespaces on the same host
This post is a continuation of my Docker series, and examines running Percona XtraDB Cluster nodes with Linux Network namespaces on the same host.
In this blog I want to look into a lower-level building block: Linux Network Namespace.
The same as with cgroups, Docker uses Linux Network Namespace for resource isolation. I was looking into cgroup a year ago, and now I want to understand more about Network Namespace.

Data Engineering & Analytics

Lighting the way to deep machine learning
Open source Torchnet helps researchers and developers build rapid and reusable prototypes of learning systems in Torch.
Building rapid and clean prototypes for deep machine-learning operations can now take a big step forward with Torchnet, a new software toolkit that fosters rapid and collaborative development of deep learning experiments by the Torch community.
How-to: Detect and Report Web-Traffic Anomalies in Near Real-Time
This framework based on Apache Flume, Apache Spark Streaming, and Apache Impala (incubating) can detect and report on abnormal bad HTTP requests within seconds.
Website performance and availability are mission-critical for companies of all types and sizes, not just those with a revenue stream directly tied to the web. Web pages can become unavailable for many reasons, including overburdened backing data stores or content-management systems or a delay in load times of third-party content such as advertisements. In addition to the deterioration of user experience during such incidents, search engines quickly apply a significant ranking penalty to slowly loading pages that contain certain keywords. Therefore, apart from determining the root cause of page-load performance or inaccessibility after the fact, detecting problems as they occur—before long-term damage occurs—is an emerging key requirement.
The Technology Behind Apple Photos and the Future of Deep Learning and Privacy
There’s a war between two visions of how the ubiquitous AI assisted future will be rendered: on the cloud or on the device. And as with any great drama it helps the story along if we have two archetypal antagonists. On the cloud side we have Google. On the device side we have Apple. Who will win? Both? Neither? Or do we all win?
A path to unsupervised learning through adversarial networks
Humans learn by observing and experiencing the physical world. We can imagine a not-too-distant future where a complete artificial intelligence system is capable of not only text and image recognition but also higher-order functions like reasoning, prediction, and planning, rivaling the way humans think and behave. For machines to have this type of common sense, they need an internal model of how the world works, which requires the ability to predict. What we’re missing is the ability for a machine to build such a model itself without requiring a huge amount of effort by humans to train it.

Network Engineering

A post-mortem on this morning’s incident
We would like to share more details with our customers and readers on the internet outages that occurred this morning and earlier in the week, and what we are doing to prevent these from happening again.
Reducing the BGP Table Size - A Fairy Tale
The issue of the relative sizes of the IPv4 and IPv6 Internet in BGP came up during discussion at the APNIC/APRICOT meeting held in Auckland, New Zealand earlier this year.

Management & Organization

Five Tips for a More Productive Team
Imagine if you could save 30 minutes per day for each member of your team with better tooling and processes. For a team of six, that adds 15 hours to the week, and with a full 15 hours, we can optimize other parts of our system. Let’s take a look at how just a few changes can lead to a more productive team.
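The 15-hour figure in this excerpt follows from simple arithmetic, assuming a five-day work week (the excerpt does not state the number of workdays, so that part is an assumption). A small sketch, with a hypothetical function name:

```python
def hours_saved_per_week(minutes_per_day, team_size, workdays_per_week=5):
    """Total hours a team recovers per week from a per-person daily saving.

    workdays_per_week defaults to 5, which is an assumption rather than
    something the excerpt spells out.
    """
    return minutes_per_day * team_size * workdays_per_week / 60


if __name__ == "__main__":
    # 30 minutes a day, six people, five days a week.
    print(hours_saved_per_week(30, 6))
```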
DevOps: Too much emphasis on the ‘Dev,’ not enough on the ‘Ops’
The DevOps team movement really started taking off in 2009, according to the Agile Admin, with the purpose of creating greater cohesion between development and operations teams. Traditionally, developers would construct very bulky software builds, and pass them off to operations for deployment and maintenance. The problem with this approach was that the development team wasn’t always savvy to operational hiccups that could degrade the functionality of software. This meant that a lot of time would have to pass between each iteration of software.
Optimizing an Agile DevOps Organization (Part 2)
An effective Agile model must incorporate operational agility. Operational agility requires a DevOps culture and mindset. Modern DevOps tools and technologies, which are essential for facilitating this culture, need to be supported by an optimal Ops organization. In the first part of the series, we examined the common operations pool that works in an Agile development environment. Then, we went on to understand the general structure of a Scalable Agile delivery model.
In this part of the series, we review the organizational aspects in greater detail. How do we introduce operations activities into Agile Release Trains? Where should they be performed: sprint, program, or portfolio (or value stream)?