Mon blog-notes à moi que j'ai

Personal blog of a sysadmin with a hacker streak

Twitter & RSS watch roundup #2016-22

This is the harvest of links for the week of May 30 to June 3, 2016. Most of them were posted on my Twitter account. Here they are, gathered for those who missed them.

Happy reading.

Security & Privacy

Rollback or Legalisation? Mass Surveillance in France and the Snowden Paradox
On June the 5th, it will be exactly three years since Pulitzer Prize-winning journalist Glenn Greenwald wrote the first article based on the trove of secret documents disclosed by the now famous NSA whistleblower Edward Snowden. Three years that saw the unfolding of an unprecedented controversy on the surveillance capabilities of the world’s most powerful intelligence agencies, thanks to the combined work of investigative journalists, computer experts, lawyers, activists and scholars. Since 2013, France is the first liberal European regime to undergo a vast reform of its legal framework regulating secret state surveillance. What follows is a reader’s digest of research presented at the 7th Biennial Surveillance & Society Conference.
Cookies vs Tokens: The Definitive Guide
TL;DR Token-based authentication is more relevant than ever. We examine the differences and similarities between cookie and token-based authentication, advantages of using tokens, and address common questions and concerns developers have regarding token-based auth. Finally, putting theory to practice, we’ll build an application that uses token authentication and make it a progressive web app.
We will be writing an Angular 2 app that uses JWT for authentication. Grab the GitHub repo if you would like to follow along.
Le client letsencrypt devient certbot
This had been announced quite a while ago: the "official" client for the Let’s Encrypt CA was due to change its name, precisely to avoid suggesting that there is one official client that is better than the others.
Security challenges for the Qubes build process
Security of the build and distribution process is something that is notoriously ignored by many open source projects (see below). In Qubes, however, we have been paying lots of attention to this problem since the very beginning.

System Engineering

Introduction à Kubernetes
In the great war of container orchestrators, Kubernetes, a solution created by Google, offers some very interesting features. I had been meaning to try it for a while but had not yet had the opportunity. So I took advantage of the Google Cloud Engine offer (two free months) to take the plunge.
Safety Check: Streamlining deployment around the world
Every day, more than a billion people turn to Facebook to connect with one another. This is especially true during times of crisis, when communication is critical both for those in the affected area and for those far away who are anxious for news from their loved ones.
After years of seeing people in crisis situations turn to Facebook to let friends know they’re safe, we launched Safety Check in 2014 as a tool to help people stay in touch when major disasters strike. Safety Check helps you share your status with friends and family, check on others in the affected area, and mark your friends as safe.
Dynomite-manager: Managing Dynomite Clusters
Dynomite has been adopted widely inside Netflix due to its high performance and low latency attributes. In our recent blog, we showcased the performance of Dynomite with Redis as the underlying data storage engine. At this point (Q2 2016), there are almost 50 clusters with more than 1000 nodes, centrally managed by the Cloud Database Engineering (CDE) team. The CDE team has wide experience with other data stores, such as Cassandra, ElasticSearch and Amazon RDS.
Presenting Torus: A modern distributed storage system by CoreOS
Persistent storage in container cluster infrastructure is one of the most interesting current problems in computing. Where do we store the voluminous stream of data that microservices produce and consume, especially when immutable, discrete application deployments are such a powerful pattern? As containers gain critical mass in enterprise deployments, how do we store all of this information in a way developers can depend on in any environment? How is the consistency and durability of that data assured in a world of dynamic, rapidly iterated application containers?
Meson: Workflow Orchestration for Netflix Recommendations
At Netflix, our goal is to predict what you want to watch before you watch it. To do this, we run a large number of machine learning (ML) workflows every day. In order to support the creation of these workflows and make efficient use of resources, we created Meson.
Monitoring A/B experiments in real-time
As a data driven company, we rely heavily on A/B experiments to make decisions on new products and features. How efficiently we run these experiments strongly affects how fast we can iterate. By providing experimenters with real-time metrics, we increase our chance to successfully run experiments and move faster.
The Future of Incident Notification in the Modern Enterprise
When the telephone first came out, making a call required the assistance of an operator. There was no direct dial. People picked up the phone and said, « Connect me to the Smith house. » In fact, people probably knew the operator by name. They’d say, « Madge, connect me to the Smith house, » and Madge would reply, « OK, hang on. »
Monitoring Cassandra at Scale
At Yelp we leverage Cassandra to fulfill a diverse workload that seems to combine every consistency and availability tradeoff imaginable. It is a fantastically versatile datastore, and a great complement for our developers to our MySQL and Elasticsearch offerings. However, our infrastructure is not done until it ships and is monitored. When we started deploying Cassandra we immediately started looking for ways to properly monitor the datastore so that we could alert developers and operators of issues with their clusters before cluster issues became site issues. Distributed datastores like Cassandra are built to deal with failure, but our monitoring solution had to be robust enough to differentiate between routine failure and potentially catastrophic failure.

Software Engineering

Algolia’s top 10 tips to achieve highly relevant search results
As a hosted-search engine service, we discuss the relevance aspect of search with our customers and prospects all day long. We now have more than 1500 customers and have seen a large variety of real-life search problems. It’s interesting to note that more often than not, these problems are in some way connected to the R word. Relevance.
Powering Continuous Delivery With Feature Flags
Separating feature rollout from code deployment to mitigate risk in continuous delivery
We are in the era of continuous delivery, where we are expected to quickly deliver software that is stable and performant. We see development teams embracing a suite of continuous integration/delivery tools to automate their testing and QA, all while deploying at an accelerated cadence.
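To make the idea of separating rollout from deployment concrete, here is a minimal Python sketch of a percentage-based feature flag. The flag table `FLAGS`, the flag names, and the helpers `bucket` and `is_enabled` are all hypothetical and not taken from the linked article; the point is only that the code ships once and exposure is then changed through configuration.

```python
import hashlib

# Hypothetical flag table; in practice this would live in a config service
# so it can change without a new deployment.
FLAGS = {
    "new-checkout": {"enabled": True, "rollout_percent": 25},
    "dark-mode": {"enabled": False, "rollout_percent": 100},
}


def bucket(flag: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, flags=FLAGS) -> bool:
    cfg = flags.get(flag)
    if cfg is None or not cfg["enabled"]:
        return False  # unknown or switched-off flags fall back to the old path
    # The same user always lands in the same bucket, so raising
    # rollout_percent only ever adds users, it never flips them back.
    return bucket(flag, user_id) < cfg["rollout_percent"]
```

Hashing the flag name together with the user id keeps each flag's audience independent, so the same users are not always the first to receive every new feature.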
Normaliser une adresse avec ElasticSearch et la base adresse nationale
A long, long time ago, I loaded the Base Adresse Nationale (the French national address database) into Elasticsearch via Logstash. I did not take it any further, for lack of time and perhaps of motivation.
How to Optimize Test Coverage in the Long Term
We want to test as much code as humanly (or mechanically) possible, right? Yes and no. For each test cycle, it’s important to consider multiple strategies for measuring test coverage and to put a system into place where it can be maximized in the long-term as well.

Database Engineering

What’s New in Vertica Version 7.2.3?
Last week we announced the release of Vertica version 7.2.3. Watch this video to learn all about what’s new.
Lost in Translation: Boolean Operations and Filters in the Bool Query
With 5.0 on the horizon, a number of query types deprecated in 2.x will be removed. Many of those are replaced by functionality of the bool query, so here’s a quick guide on how to move away from filtered queries; and, or, not queries; and a general look into how to parse boolean logic with the bool query.
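As a small illustration of that migration, the Python sketch below rewrites a 2.x-style filtered query body into the equivalent bool query, with the scoring part under "must" and the non-scoring part under "filter". The helper `filtered_to_bool` and the field names are made up for the example; only the query shapes reflect the Elasticsearch query DSL.

```python
def filtered_to_bool(query: dict) -> dict:
    """Rewrite {"filtered": {"query": q, "filter": f}} as a bool query.

    The scoring part goes under "must", the non-scoring part under "filter".
    Any other query is returned unchanged.
    """
    if "filtered" not in query:
        return query
    body = query["filtered"]
    bool_body = {}
    if "query" in body:
        bool_body["must"] = body["query"]
    if "filter" in body:
        bool_body["filter"] = body["filter"]
    return {"bool": bool_body}


# Field names ("title", "status") are made up for the example.
old = {
    "filtered": {
        "query": {"match": {"title": "search"}},
        "filter": {"term": {"status": "published"}},
    }
}
new = filtered_to_bool(old)
```

The and, or, and not queries can be handled in the same spirit by mapping them onto the bool query's must (or filter), should, and must_not clauses.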
Quick Start Guide - Configuring Elasticsearch with Shield and Active Directory
When learning a new system, I always find it useful to have instructions on how to install and configure a feature with minimum steps in a cookbook style format. It allows me to get up and running quickly without having to reference several pages with the multitude of optional settings. From that point, I can take a look at the reference manuals and the more advanced options to configure the components according to my end architecture requirements.

MySQL & MariaDB

Galera warning « last inactive check »
In this post, we’ll discuss the Galera warning « last inactive check » and what it means.
MySQL spatial functionality - points of interest around me
This week I was preparing the exercises for our MySQL/MariaDB for Beginners training. One of the exercises of the training is about MySQL spatial (GIS) features. I always tell customers: « With these features you can answer questions like: Give me all points of interest around me! »
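The linked post answers that question with MySQL's spatial functions. As a database-independent illustration of the same query, here is a plain-Python sketch that uses the haversine great-circle distance instead. The helper names, the sample points, and their approximate coordinates are assumptions for the example and do not come from the post.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def around_me(me, pois, radius_km):
    """Return (name, distance) pairs within radius_km of `me`, nearest first."""
    hits = []
    for name, lat, lon in pois:
        d = haversine_km(me[0], me[1], lat, lon)
        if d <= radius_km:
            hits.append((name, d))
    return sorted(hits, key=lambda h: h[1])


# Approximate coordinates, for illustration only.
POIS = [
    ("Eiffel Tower", 48.8584, 2.2945),
    ("Louvre", 48.8606, 2.3376),
    ("Versailles", 48.8049, 2.1204),
]
```

Calling `around_me((48.8530, 2.3499), POIS, 5.0)` keeps only the points within 5 km of that position. A database would normally narrow the candidates with a spatial index first rather than scanning every row as this sketch does.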
What is a big innodb_log_file_size?
In this post, we’ll discuss what constitutes a big innodb_log_file_size, and how it can affect performance.
In the comments for our post on Percona Server 5.7 performance improvements, someone asked why we use innodb_log_file_size=10G, suggesting that it might be too big.

Data Engineering & Analytics

New in CDH 5.7: Improved Performance, Security, and SQL Experience in Hue
CDH 5.7 includes a lot of changes (more than 1,500) to Hue, the Web UI that makes Apache Hadoop easier to use.
In this new release, the emphasis on performance and security carries over from 5.5. The overall improvement in the SQL user experience is also considerable.
In this post, we’ll cover some highlights.
Movie Recommendation: Should I trust IMDb?
Hi there, nice comments about the top 12 secret shortcuts of Dataiku DSS convinced me to write another one: this time it’s about plugins, movies, APIs, and how to choose the best one to watch with friends.

Network Engineering

Project Falco Contributes to OpenSwitch, a Linux Foundation Project
We recently wrote a blog post on Project Falco, our new disaggregated switching platform that is part of Altair, LinkedIn’s next-generation data center design. Our vision is to build a programmable data center fabric on top of an open network operating system. While scaling our data centers out, we want to control the complexity of data center fabric by moving toward a fully-automated, self-healing, and purpose-built application-centric network that operates on its own. By building a native Linux-based network operating system with open interfaces, it is now possible to manage switches and extend visibility, controls, and applications to network elements in the same way we do on servers.
Anycast vs. DDoS - Evaluating the November 2015 Root DNS Event
IP anycast has been widely used to replicate services in multiple locations as a way to deliver better performance and resilience. It has been largely employed by CDNs and DNS operators, such as the Root DNS system. However, there is little evaluation of anycast under stress.

Management & Organization

Designed for Collaboration: Helping Engineers Be Awesome Together
New Relic hires smart people. You already knew that, right? But smart is just the beginning. What makes our engineers truly special is their commitment to collaboration—their enthusiasm for working together to achieve the best possible results.
Composition d’une équipe technique produit
So, what do we put in a technical team?
It depends on the timing, the product, and the needs. Here is my default recipe, to be rearranged to fit reality. Still, every time, I end up telling myself I would have liked to see the team follow this layout.