Mon blog-notes à moi que j'ai

Personal blog of a sysadmin with a hacker streak

Twitter & RSS watch roundup #2016-29

This week's harvest of links, covering 18 to 23 July 2016. Most of them were posted on my Twitter account. Here they are, gathered in one place for anyone who missed them.

Happy reading.

Security & Privacy

SSH-Ident: manage your SSH agents cleanly
If, like me, you are a sysadmin, you inevitably use SSH every day to manage your machines (unless they run Windows, in which case please accept my condolences…). And if you are a real sysadmin, you also inevitably log in with an SSH key rather than a password. Typing your passphrase before every connection to a machine is a real chore, though, so you most likely also use an SSH agent to load your keys once and for all at the first connection.
Testing the HSTS preload process
My registrar had an offer on domains so I figured I’d grab one and test out the HSTS preload process as it currently stands. I want to track how easy it is to preload and how long it takes for full browser coverage in vendor preload lists.
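For readers who want to check a domain before submitting it, here is a minimal sketch of a header check. It assumes the preload list expects a max-age directive above some minimum plus the includeSubDomains and preload directives; the one-year minimum used here is an assumption, and the helper names are mine, not the article's. Check the preload list's current requirements before relying on it.

```python
# Assumed minimum max-age (one year, in seconds); the real threshold
# is set by the preload list and may differ.
MIN_MAX_AGE = 31536000


def parse_hsts(header: str) -> dict:
    """Split a Strict-Transport-Security header value into directives."""
    directives = {}
    for part in header.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directive names are case-insensitive, so normalise them.
        directives[name.strip().lower()] = value.strip().strip('"')
    return directives


def is_preload_ready(header: str, min_max_age: int = MIN_MAX_AGE) -> bool:
    """Return True if the header carries the three directives assumed above."""
    directives = parse_hsts(header)
    try:
        max_age = int(directives.get("max-age", ""))
    except ValueError:
        return False
    return (
        max_age >= min_max_age
        and "includesubdomains" in directives
        and "preload" in directives
    )
```

Calling is_preload_ready("max-age=63072000; includeSubDomains; preload") returns True, while dropping the preload directive or shortening max-age makes it return False.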
Authenticate without a password thanks to XMPP!
HTTP authentication via XMPP is an extension of the XMPP protocol (a XEP).
It lets you log in to a website without a password: the site sends a confirmation request to the user of the XMPP account, who grants or denies access.
Handling Cross-Site Scripting As Attacks Get More Sophisticated
Adopting third-party libraries to encode user input in the development phase and using a web application firewall in the deployment phase could fool web security managers into thinking their web applications are completely safe from Cross-Site Scripting (XSS) attacks. While it’s a good idea to employ these techniques, the illusion of safety could prove costly. These protection methods do not guarantee that your web applications are 100% free of XSS vulnerabilities, and XSS attacks that use more sophisticated techniques still occur, so care should still be taken.
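A small illustration of why output encoding alone does not settle the matter, using only Python's standard library. The function names are hypothetical and not taken from the article. HTML-escaping neutralises markup in a text context, but it leaves a javascript: URL untouched, so an href needs a separate scheme check.

```python
import html
from urllib.parse import urlsplit


def render_comment(user_input: str) -> str:
    """Escape user input for an HTML text context."""
    return "<p>" + html.escape(user_input) + "</p>"


def is_safe_url(url: str) -> bool:
    """Allow only absolute http(s) URLs (relative URLs are rejected too)."""
    return urlsplit(url).scheme in ("http", "https")


def render_link(user_url: str) -> str:
    """Escape the URL for an attribute context, after checking its scheme."""
    if not is_safe_url(user_url):
        user_url = "#"  # fall back to a harmless target
    return '<a href="' + html.escape(user_url, quote=True) + '">profile</a>'


payload = "javascript:alert(1)"
# Escaping does not change the payload: it contains no HTML metacharacters.
assert html.escape(payload, quote=True) == payload
```

The allowlist is deliberately strict. It is a sketch of one defence for one context, not a complete XSS filter.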
How DevOps Can Help Improve Security
The term “DevOps” these days tends to evoke images of automation and push-button application deployments whenever app dev wants. It’s anarchy, it’s chaos, and it’s a frightening notion to those for whom stability and security of the core business network is their top priority. After all, folks in the DevOps camp routinely cheer on a “Chaos Monkey,” whose sole purpose is to break things in the production network. On the surface, that hardly seems conducive to stability or security.
A Quick, Simple Guide to Tor and the Internet of Things (So Far)
“The Internet of Things” is the remote control and networking of everyday devices ranging from a family’s lawn sprinkler or babycam to a corporation’s entire HVAC system.
Tor Project contributor Nathan Freitas, Executive Director of The Guardian Project, has developed a new way to use Tor’s anonymous onion services to protect the “Internet of Things.” The new system, while experimental, is also scalable.

System Engineering

A Beginner’s Guide to Logstash Grok
The ability to efficiently analyze and query the data being shipped into the ELK Stack depends on the information being readable. This means that as unstructured data is being ingested into the system, it must be translated into structured message lines.
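To make the idea concrete outside Logstash, here is a rough Python equivalent of what a grok pattern does: a regular expression with named groups turns an unstructured access-log line into a dictionary of fields. The field names and the sample line are assumptions of mine, and this is plain Python, not Logstash configuration.

```python
import re

# Named groups play the role of grok's field names (names chosen here).
LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_line(line: str):
    """Return the structured fields of an access-log line, or None."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the status code so it can be queried as a number.
    fields["status"] = int(fields["status"])
    return fields


sample = '203.0.113.7 - - [21/Jul/2016:10:15:32 +0200] "GET /index.html HTTP/1.1" 200 5120'
fields = parse_line(sample)
```

Lines that do not match return None, which is roughly the situation grok flags as a parse failure.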
The Role of A Container Cluster Manager
Almost all the container runtimes have been designed to run containers on a single container host. This is by design: containers share the host operating system kernel and features such as cgroups, namespaces, chroot, SELinux, seccomp, etc. to provide isolation and security. Therefore a given set of containers may need to run on a single container host. At the moment, none of the available container runtimes provides a mechanism for integrating multiple container hosts to share the workload, except by using a container cluster manager. Figure 1 illustrates how a software solution deployed on a set of VMs can be moved to a containerized environment using a single container host.
Building Highly Scalable V6 Only Cloud Hosting
This article is about how we built a new, highly scalable cloud hosting solution using IPv6-only communication between commodity servers, the problems we faced with the IPv6 protocol, and how we tackled them to handle more than ten million active users.
Open19: A New Vision for the Data Center
This week at the DatacenterDynamics Webscale conference, I have the privilege of announcing the first step in a long journey, an initiative that we hope will further the state of hardware in the data center. Called Open19, this new project aims to establish a new open standard for servers based on a common form factor. The goals of Open19 are to provide lower cost per rack, lower cost per server, optimized power utilization, and (eventually) an open standard that everyone can contribute to and participate in.
The Uber Engineering Tech Stack, Part I: The Foundation
Uber’s mission is transportation as reliable as running water, everywhere, for everyone. To make that possible, we create and work with complex data. Then we bundle it up neatly as a platform that enables drivers to get business and riders to get around.
The Uber Engineering Tech Stack, Part II: The Edge and Beyond
Uber’s mission is transportation as reliable as running water, everywhere, for everyone. Last time, we talked about the foundation that powers Uber Engineering. Now, we’ll explore the parts of the stack that face riders and drivers, starting with the world of Marketplace and moving up the stack through web and mobile.
A few takeaways on Grafana after one week
The article on ADSL sync monitoring was a small hit, and yet it is far from perfect. For a start, it is only an example, and for now it applies only to owners of a Freebox V5/Crystal. Besides, I use it for nothing else: I store absolutely nothing about the Raspberry Pi that hosts it all and does the collecting, and everything is local. In short, a rather lazy first draft.

Monitoring

Hadoop architectural overview
In this post, we’ll explore each of the technologies that make up a typical Hadoop deployment, and see how they all fit together. If you’re already familiar with HDFS, MapReduce, and YARN, feel free to continue on to Part 2 to dive right into Hadoop’s key performance metrics.
How to monitor Hadoop metrics
If you’ve already read our guide to Hadoop architecture, you know about the components that make up a typical Hadoop cluster. In this post, we’ll dive deep into each of the technologies introduced in that guide and explore the key metrics exposed by Hadoop that you should keep your eye on.
Metric graphs 101: Graphing anti-patterns
In the first two parts of this series, we introduced you to several different visualization types—both timeseries graphs that have time as the x-axis, and summary graphs that provide a summary view of a time window.
In this article we show you three ways that these visualizations are often misused and then suggest better solutions.

Software Engineering

SuperRoot: Launching a High-SLA Production Service at Twitter
Our Search Infrastructure team is building a new information retrieval system called Omnisearch to power Twitter’s next generation of relevance-based, personalized products. We recently launched the first major architectural component of Omnisearch, the SuperRoot. We thought it would be interesting to share what’s involved with building, productionizing, and launching a new high-scale, high-SLA distributed system at Twitter.
What is Loose Coupling?
A couple of days ago, I had a developer ask me what it meant to have tightly coupled code and how to detect it.
On Tuesday, I saw a similar question on Quora asking if there was any tool to analyze layers in application architecture.
Since these two subjects relate to each other, I thought I would provide a solid resource (excuse the pun) to explain every aspect of loosely coupled systems.
So where do we begin? How about the obvious?
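As a taste of what the question is about, here is a minimal sketch of loose coupling in Python. All class names are hypothetical and not taken from the linked article. OrderService depends only on a small interface, so the concrete notifier can be swapped (for a test double, say) without touching the service.

```python
from typing import List, Protocol


class Notifier(Protocol):
    """The only thing OrderService knows about its collaborator."""

    def send(self, message: str) -> None: ...


class ConsoleNotifier:
    """One concrete notifier: prints the message."""

    def send(self, message: str) -> None:
        print(message)


class RecordingNotifier:
    """Another notifier: keeps messages in memory, handy in tests."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def send(self, message: str) -> None:
        self.sent.append(message)


class OrderService:
    """Receives its notifier from outside instead of constructing one."""

    def __init__(self, notifier: Notifier) -> None:
        self._notifier = notifier

    def place(self, item: str) -> None:
        self._notifier.send(f"order placed: {item}")


recorder = RecordingNotifier()
OrderService(recorder).place("book")
```

A tightly coupled version would instantiate ConsoleNotifier inside OrderService, which is one of the signs to look for when hunting for coupling.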
Can Containers Really Ship Software?
Maybe by copying it to a disk? If you are wondering why on earth we would need containers to ship software when the internet is out there: this is not about the containers that ship goods. It is about Linux containers, which provide operating-system-level virtualization for creating isolated environments, similar to virtual machines, for running software applications.
A Quick Introduction to HTTP-RPC
HTTP-RPC is an open-source framework for simplifying development of REST-based applications. It allows developers to create and access HTTP-based web services using a convenient, RPC-like metaphor while preserving fundamental REST principles such as statelessness and uniform resource access.
The project currently includes support for implementing REST services in Java and consuming services in Java, Objective-C/Swift, or JavaScript. The server component provides a lightweight alternative to other, larger Java-based REST frameworks, and the consistent cross-platform client API makes it easy to interact with services regardless of target device or operating system.
This article provides an introduction to the HTTP-RPC framework along with sample code demonstrating the implementation of a simple HTTP-RPC service in Java. Client examples in Swift, Java, and JavaScript are also provided.

Web Performance

The “Goal” of Performance Tuning
My first co-op job was working in packaging. They had a small Industrial Engineering library and let me read from it at work. The first book I read changed my life and my way of thinking about problem solving: The Goal by Eliyahu M. Goldratt. Just about every time I work on performance, it’s impossible for me to not make comparisons to The Goal in my head. In this post, we’ll look at some stories from the book and how they apply to performance tuning in programming.
How third party resource analytics help digital businesses gain control
What percentage of the total page resources that your web or mobile property delivered yesterday were third party resources, such as ads, social media badges, monitoring scripts, or tracking tools? Exactly how big is the impact of third parties on key metrics such as conversion rates, bounce rates, and page views?

Mobile

Protocol Buffers: benchmark and use on mobile
Going ever faster on smartphones has become essential. Beyond the means of communication, the data format used also affects speed. JSON is today's standard for APIs. But is this data format suited to mobile? Handling JSON on Android, for example, is not simple.
Other data formats have emerged over the past few years, such as Thrift, Avro, MessagePack, and Protocol Buffers.
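To give a feel for where the size difference comes from, here is a small hand-rolled sketch of how Protocol Buffers encodes an unsigned integer field on the wire (a varint preceded by a key byte), compared with the same value in compact JSON. The field name "id" and field number 1 are illustrative assumptions. Real code would use the generated protobuf classes rather than this helper, and this says nothing about the benchmark results in the article.

```python
import json


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F  # take the low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(byte)
            return bytes(out)


def encode_uint_field(field_number: int, value: int) -> bytes:
    """Encode one varint field: key (field number, wire type 0), then value."""
    key = (field_number << 3) | 0  # wire type 0 means varint
    return encode_varint(key) + encode_varint(value)


wire = encode_uint_field(1, 150)  # hypothetical field number 1
text = json.dumps({"id": 150}, separators=(",", ":")).encode("utf-8")
sizes = (len(wire), len(text))
```

Here the binary form takes 3 bytes against 10 for the JSON text, mostly because the field name is not repeated in every message.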

Database Engineering

Elasticsearch

Running site plugins with Elasticsearch 5.0
Way back in Elasticsearch 0.17, Elasticsearch gained the ability to serve static web pages, and site plugins were born. Site plugins allowed users to write Javascript applications which provide graphical user interfaces to Elasticsearch.

Vertica

What’s New in 7.2.3: New Apache Hadoop Integration Features
We’ll cover the first two features in this blog post. Look for a separate blog post about the new Parquet Reader.

Data Engineering & Analytics

How SOASTA and Google used machine learning to predict bounce rate and conversions
If you’re ever offered a chance to throw a million or so beacons’ worth of user data into a Google-developed machine-learning system, I highly recommend you say yes. That’s what we did here at SOASTA, when Google approached us a year ago about partnering on a pioneering research project. The results have been eye-opening.
In this post, I’ll walk through a bit of the methodology we used, as well as some of the highlights of our findings. I’ll also share the machine learning code we used, which we’ve open sourced, as well as a few tips we picked up during this project.
Routing Data from Docker to Prometheus Server via Fluentd
Possibly the best way to build an economy of scale around your framework, whatever it is, is to build up your library of integrations – or integrators – and see what and who your new partners can bring into the mix.
In this blog, we’ll trace the steps to connect Fluentd to a Docker container to route stdout commands (our data) to Prometheus. (Prometheus could be similarly configured on Google Cloud Platform, CoreOS or even Kubernetes). Later, we’ll also query Prometheus for that data.

Network Engineering

Harnessing light for wireless communications
As part of our connectivity efforts, Internet.org is working on ways to connect the 4 billion people who are currently offline. Of those 4 billion, we know that 1.6 billion live in sparsely populated areas without broadband wireless infrastructure, and connecting them will require very different technologies than the solutions currently used. The Connectivity Lab team at Facebook is working to solve this problem, and has introduced a number of new technologies aimed at bringing better connectivity to these populations.
Internet Access Disruption In Turkey - July 2016
With the attempted coup in Turkey, reports went out about social media being throttled and/or blocked. We analysed data about this that we collected with RIPE Atlas and OONI.
On 15 July, a coup was attempted in Turkey. We heard about social media being throttled and/or blocked, but much was unclear about what was actually going on. Here we present measurement data from various platforms that shared their data publicly.

Management & Organization

How Our Agile Review Process Builds Better Engineers Faster
When it comes to coaching and reviews, New Relic engineering strives for fast iteration and mutual trust.
In New Relic’s product engineering organization, every individual has weekly or bi-weekly one-on-one meetings with their manager and receives two or more performance reviews a year. A friend of mine who is a workforce development trainer at a big healthcare provider was surprised to learn how much time we spend on this. But it actually makes good sense.
Software is a high-leverage business and a creative pursuit. Engineers are more satisfied when they’re improving their craft, and even small improvements in an engineer’s abilities can pay big dividends. We invest heavily in personal growth because the compounded interest on that increased effectiveness helps liberate everyone from the tyranny of micromanagement and creates tremendous value.
Prometheus has come of age – a reflection on the development of an open-source project
On Monday this week, the Prometheus authors released version 1.0.0 of the central component of the Prometheus monitoring and alerting system, the Prometheus server. (Other components will follow suit over the coming months.) This is a major milestone for the project. Read more about it on the Prometheus blog, and check out the announcement of the CNCF, which has recently accepted Prometheus as a hosted project.
How Does Google do Planet-Scale Engineering for a Planet-Scale Infrastructure?
How does Google keep all its services up and running? They almost never seem to fail. If you’ve ever wondered, we get a wonderful peek behind the curtain in a talk given at GCP NEXT 2016 by Melissa Binde, Director, Storage SRE at Google: How Google Does Planet-Scale Engineering for Planet-Scale Infrastructure.