Mon blog-notes à moi que j'ai

Personal blog of a sysadmin with a hacker bent

Twitter & RSS watch roundup #2016-49

This week's harvest of links, covering December 5 to 9, 2016. Most of them were posted on my Twitter account. Here they are, gathered for those who missed them.

Happy reading

Security & Privacy

Why closing port 80 is bad for security
We’ve made some pretty big steps in our transition to a secure web but one thing that I often get asked about is closing port 80 as part of that transition. Here are my thoughts on why we shouldn’t do that.
Silence : XMPP, chiffrement et méta-données
Silence is a free-software (GPLv3) SMS and MMS application for Android that encrypts communications with other Silence users. Silence therefore lets you send text and images securely, but text and images sent to ordinary users will still cross the network in cleartext. The application is available as source code on GitHub and as a binary on F-Droid and the Play Store.
Progress on Privacy
The internet didn’t come with privacy, any more than the planet did. But at least the planet had nature, which provided raw materials for the privacy technologies we call clothing and shelter. On the net, we use human nature to make our own raw materials. Those include code, protocols, standards, frameworks and best practices, such as those behind free and open-source software.
Protection des données : le chiffrement ne suffit pas
In data protection protocols, even the strongest encryption becomes a Maginot Line if the other elements of the protocol are weak. This is one of the questions on the agenda of the conference “Sécurité informatique : mythes et réalité”, organized by the CNRS on December 8 and 9 in Paris.
Email Security - DMARC
The last in the email security series, DMARC, or Domain-based Message Authentication, Reporting and Conformance, builds on both SPF and DKIM. It improves security further and adds reporting, so you can monitor your domain for fraudulent or spoofed emails and take action.
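To make the DMARC entry concrete, here is a minimal Python sketch of how a DMARC policy, published as a DNS TXT record, breaks down into tag/value pairs. The record string and the example.com report address are hypothetical, and the function name parse_dmarc is only an illustration, not part of any library.

```python
def parse_dmarc(record: str) -> dict:
    """Split a DMARC TXT record into its tag/value pairs."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed tag: {part!r}")
        tags[name.strip().lower()] = value.strip()
    # A DMARC record must identify itself with the version tag.
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record")
    return tags


# Hypothetical record, for illustration only.
record = "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com; pct=100"
policy = parse_dmarc(record)
print(policy["p"], policy["rua"])
```

The p tag carries the policy requested of receivers and rua the address for aggregate reports, which is where the monitoring described in the article comes from.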

System Engineering

Uber Engineering’s Durable and Scalable Task Queue in Go
Cherami is a distributed, scalable, durable, and highly available message queue system we developed at Uber Engineering to transport asynchronous tasks. We named our task queue after a heroic carrier pigeon with the hope that this system would be just as resilient and fault-tolerant, allowing Uber’s mission-critical business logic components to depend on it for message delivery.
Secure USB boot with Debian
The moment you leave your laptop, say in a hotel room, you can no longer trust your system, as it could have been modified while you were away. Think you are safe because you have an encrypted disk? Well, if the boot partition is on the laptop itself, it can be manipulated and you will not notice, because the boot partition can’t be encrypted. The BIOS needs to access the MBR and boot loader, which then loads the Linux kernel, all unencrypted. There have been some reports lately that Linux cryptsetup is insecure because you can spawn a root shell by holding the enter key for 70 seconds. This is not the real threat to your system. If someone has physical access to your hardware, they can get a root shell in less than a second by passing init=/bin/bash as a parameter to the Linux kernel in the boot loader, regardless of whether cryptsetup is used or not! The attacker can also use other means, such as booting a live system from CD or USB. The real weakness here is the unencrypted boot partition, not some script that gets executed from it. So how do you prevent this physical access attack vector? Just keep reading this guide.
How we made diff pages three times faster
We serve a lot of diffs here at GitHub. Because it is computationally expensive to generate and display a diff, we’ve traditionally had to apply some very conservative limits on what gets loaded. We knew we could do better, and we set out to do so.
HTTP/2 Push: The details
HTTP/2 (h2) is here and it tastes good! One of the most interesting new features is h2 push, which allows the server to send data to the browser without having to wait for the browser to explicitly request it first.

Monitoring

Introducing Chaperone: How Uber Engineering Audits Kafka End-to-End
As Uber continues to scale, our systems generate continually more events, interservice messages, and logs. That data needs to go through Kafka to get processed. How does our platform audit all these messages in real time?

Software Engineering

Lessons in resilience at SoundCloud
Building and operating services distributed across a network is hard. Failures are inevitable. The way forward is having resiliency as a key part of design decisions.
This post talks about two key aspects of resiliency when doing RPC at scale - the circuit breaker pattern, and its power combined with client-side load balancing.
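As a rough illustration of the circuit breaker pattern mentioned above, here is a minimal Python sketch. It is not SoundCloud's implementation; the class name, the thresholds and the injectable clock parameter are assumptions chosen for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures    # consecutive failures before opening
        self.reset_timeout = reset_timeout  # seconds to wait before a trial call
        self.clock = clock                  # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None               # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of hitting the struggling service.
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures or self.opened_at is not None:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A client would wrap each remote call, for example breaker.call(fetch_profile, user_id) with a hypothetical fetch_profile function, and treat CircuitOpenError as a signal to use a fallback or another backend.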
Transitioning to Python 3
The Python language, which is not new but continues to gain momentum and users as if it were, has changed remarkably little since it first was released. I don’t mean to say that Python hasn’t changed; it has grown, gaining functionality and speed, and it’s now a hot language in a variety of domains, from data science to test automation to education. But, those who last used Python 15 or 20 years ago would feel that the latest versions of the language are a natural extension and evolution of what they already know.
Securing Microservices: A Brief Look at Different Technologies
In a microservices architecture, a set of fine-grained services interact with each other to build an application or fulfill a business functionality. Each fine-grained service implements a single function or a few related functions accessible over a network. This leads to an increased attack surface, making the security of a microservices architecture very important.

Databases Engineering

MySQL & MariaDB

MySQL 8.0: UUID support
In MySQL 8.0.0 we introduced many new features; among those, three new functions that ease and enhance the support for working with UUIDs.
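The excerpt does not name the three functions; in MySQL 8.0 they are UUID_TO_BIN, BIN_TO_UUID and IS_UUID. The Python sketch below only mirrors the idea behind them with the standard uuid module, namely converting between the 36-character text form and a compact 16-byte value. It is an analogy, not MySQL's implementation, and the lowercase function names are chosen for illustration.

```python
import uuid


def uuid_to_bin(text: str) -> bytes:
    """Text UUID to its 16-byte binary form."""
    return uuid.UUID(text).bytes


def bin_to_uuid(raw: bytes) -> str:
    """16-byte binary form back to the canonical text UUID."""
    return str(uuid.UUID(bytes=raw))


def is_uuid(text: str) -> bool:
    """True if the string parses as a UUID."""
    try:
        uuid.UUID(text)
    except ValueError:
        return False
    return True


text = "6ccd780c-baba-1026-9564-5b8c656024db"  # sample value, for illustration
raw = uuid_to_bin(text)
print(len(text), len(raw))
```

Storing the 16-byte form instead of the 36-character string is what makes UUID keys smaller in indexes.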
MySQL 8.0: Storing IPv6
In MySQL 8.0.0 we introduced many new features; among those, we extended the bit-wise operations to work with binary data. Because of these changes, storing and manipulating IPv6 addresses can be done in an easier manner. In this blog we will take a look at how you can do this for some of the most common use cases.
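The idea of bitwise operations on binary IPv6 values can be sketched in Python with the standard ipaddress module. This is an analogy to what the article does in SQL, not MySQL code; the helper names and the sample address from the 2001:db8::/32 documentation range are assumptions for the example.

```python
import ipaddress


def ipv6_to_bin(text: str) -> bytes:
    """IPv6 text form to its 16-byte binary form."""
    return ipaddress.IPv6Address(text).packed


def bin_to_ipv6(raw: bytes) -> str:
    """16-byte binary form back to compressed IPv6 text."""
    return str(ipaddress.IPv6Address(raw))


def bitwise_and(a: bytes, b: bytes) -> bytes:
    """Byte-by-byte AND of two equal-length binary values."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x & y for x, y in zip(a, b))


addr = ipv6_to_bin("2001:db8:85a3::8a2e:370:7334")
mask = ipv6_to_bin("ffff:ffff:ffff:ffff::")  # a /64 netmask
network = bin_to_ipv6(bitwise_and(addr, mask))
print(network)  # prints 2001:db8:85a3::
```

Masking the binary address this way extracts the /64 network prefix, one of the common use cases for storing addresses as 16-byte values rather than text.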

Data Engineering & Analytics

Personalized Recommendations in LinkedIn Learning
We recently launched LinkedIn Learning, an online learning platform that enables students and professionals to take courses and learn the skills required to meet their career goals. As part of this platform, we provide personalized course recommendations. A/B testing indicates that we have 58% higher engagement rate when we provide personalized recommendations compared to generic or randomized recommendations.
Achieving a 300% speedup in ETL with Spark
A common design pattern often emerges when teams begin to stitch together existing systems and an EDH cluster: file dumps, typically in a format like CSV, are regularly uploaded to EDH, where they are then unpacked, transformed into optimal query format, and tucked away in HDFS where various EDH components can use them. When these file dumps are large or happen very often, these simple steps can significantly slow down an ingest pipeline. Part of this delay is inevitable; moving large files across the network is time-consuming because of physical limitations and can’t be readily sped up. However, the rest of the basic ingest workflow described above can often be improved.
Beginners Guide to Regression Analysis and Plot Interpretations
If you are aspiring to become a data scientist, regression is the first algorithm you need to master. Not just to clear job interviews, but to solve real world problems. To this day, a lot of consultancy firms continue to use regression techniques at a larger scale to help their clients. No doubt, it’s one of the easiest algorithms to learn, but it requires persistent effort to get to the master level.

Network Engineering

Building and scaling the Fastly network, part 2: balancing requests
The primary challenge of load balancing HTTP requests is derived from an unassailable constraint: if a packet belonging to an established TCP connection is forwarded to an incorrect server, the associated TCP flow will be reset. Unfortunately, the network layer does not understand the concept of a flow any more than applications understand the notion of packets. In the following paragraphs, we’ll outline the three traditional approaches to load balancing requests, all of which are ill-suited for a general-purpose CDN.

Management & Organization

Top 6 DevOps Metrics that Enterprise Dashboards Should Capture
Testing for websites/web applications is a constant challenge and with bubbling issues related to compatibility and security, there is a rising need for continuous development and improvement. Effective collaboration between testers and developers is becoming increasingly essential to meet the Quality Assurance (QA) goals.
Day 4 - Change Management: Keep it Simple, Stupid
I love change management. I love the confidence it gives me. I love the traceability–how it’s effectively a changelog for my environment. I love the discipline it instills in my team. If you do change management right, it allows you to move faster. But your mileage may vary.