Mon blog-notes à moi que j'ai

Personal blog of a sysadmin with a hacker streak

Twitter & RSS watch roundup #2015-24

The harvest of links for the week of June 8 to 12, 2015. Most of them were published on my Twitter account. Here they are, gathered for those who missed them.

Happy reading

Security

Introducing TLS Maturity Model
As part of my job working on SSL Labs, I spend a lot of time helping others improve their TLS security, both directly and indirectly–by developing tools and writing documentation. Over time, I started to notice that deploying TLS securely is getting more complicated, rather than less. One possibility is that, with so much attention on TLS and many potential issues to consider, we’re losing sight of what’s really important.

Big Data

Burrow: Kafka Consumer Monitoring Reinvented
One of the responsibilities of the Data Infrastructure SRE team is to monitor the Apache Kafka infrastructure, the core pipeline for much of LinkedIn’s data, in the most effective way to ensure 100% availability. We have recently developed a new method for monitoring Kafka consumers that we are pleased to release as an open source project - Burrow. Named after Franz Kafka’s unfinished short story, Burrow digs through the maze of message offsets from both the brokers and consumers to present a concise, but complete, view of the state of each subscriber.
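The offset comparison mentioned above can be illustrated with a per-partition lag calculation. This is only a minimal sketch of the general idea, not Burrow's actual evaluation logic: the function name, the dictionary layout, and the sample offsets are all assumptions made for illustration.

```python
from typing import Dict


def consumer_lag(broker_offsets: Dict[int, int],
                 consumer_offsets: Dict[int, int]) -> Dict[int, int]:
    """Return per-partition lag: newest broker offset minus committed consumer offset."""
    lag = {}
    for partition, head in broker_offsets.items():
        # Assumption: a partition with no committed offset is treated as offset 0.
        committed = consumer_offsets.get(partition, 0)
        # Clamp at zero in case the commit is newer than the broker snapshot.
        lag[partition] = max(head - committed, 0)
    return lag


# Hypothetical offsets for a three-partition topic.
broker = {0: 1500, 1: 980, 2: 2040}
consumer = {0: 1500, 1: 900}
print(consumer_lag(broker, consumer))  # {0: 0, 1: 80, 2: 2040}
```

A single lag number per partition is only a starting point; a monitoring tool would normally also look at how that number evolves over time.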
How are recommendation engines built?
The success of Amazon and Netflix has made recommendation systems not only common but also extremely popular. For many people, the recommendation system seems to be one of the easiest applications to understand; and a majority of us use them daily.
Open Sourcing Pinot: Scaling the Wall of Real-Time Analytics
Last fall we introduced Pinot, LinkedIn’s real-time analytics infrastructure, that we built to allow us to slice and dice across billions of rows in real-time across a wide variety of products. Today we are happy to announce that we have open sourced Pinot. We’ve had a lot of interest in Pinot and are excited to see how it is adopted by the open source community.
Inside Apache HBase’s New Support for MOBs
Apache HBase is a distributed, scalable, performant, consistent key value database that can store a variety of binary data types. It excels at storing many relatively small values (<10K), and providing low-latency reads and writes.
However, there is a growing demand for storing documents, images, and other moderate objects (MOBs) in HBase while maintaining low latency for reads and writes. One such use case is a bank that stores signed and scanned customer documents. As another example, transport agencies may want to store snapshots of traffic and moving cars. These MOBs are generally write-once.

Software Engineering

Seven Deadly Sins of a Software Project
Maintainability is the most valuable virtue of modern software development. Maintainability can basically be measured as the working time required for a new developer to learn the software before he or she can start making serious changes in it. The longer the time, the lower the maintainability. In some projects, this time requirement is close to infinity, which means it is literally unmaintainable. I believe there are seven fundamental and fatal sins that make our software unmaintainable. Here they are.
Open-sourcing Facebook Infer: Identify bugs before you ship
Today, we’re open-sourcing Facebook Infer, a static program analyzer that Facebook uses to identify bugs before mobile code is shipped. Static analyzers are automated tools that spot bugs in source code by scanning programs without running them. They complement traditional dynamic testing: Where testing allows individual runs through a piece of software to be checked for correctness, static analysis allows multiple and sometimes even all flows to be checked at once. Facebook Infer uses mathematical logic to do symbolic reasoning about program execution, approximating some of the reasoning a human might do when looking at a program. We use Facebook Infer internally to analyze the main Facebook apps for Android and iOS (used by more than a billion people), Facebook Messenger, and Instagram, among others. At present, the analyzer reports problems caused by null pointer access and resource and memory leaks, which cause a large percentage of app crashes.
Keep Your Commits Small
Every developer knows the good practice of performing small commits (at least every developer who has not spent the last few years within a cave). It considerably facilitates change tracking. Commit messages are more accurate and it is easier to learn the purpose of each revision. It is easier to revert problematic changes and use the bisect mechanism to find specific commits. And if you are pushing your changes as frequently as you commit, you will not lose weeks of work by an accidentally entered command or disk crash.
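The bisect mechanism mentioned above is a binary search over the commit history, which is why small commits make it more precise. The following is a minimal sketch of that idea, not git's implementation: the commit identifiers and the `is_bad` check are hypothetical, and it assumes a linear history with a single good-to-bad transition and a bad last commit.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str],
                     is_bad: Callable[[str], bool]) -> str:
    """Binary-search a linear history (oldest first) for the first bad commit.

    Assumes the last commit is bad and that every commit after the first
    bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # the culprit is mid or earlier
        else:
            lo = mid + 1      # the culprit is after mid
    return commits[lo]


# Hypothetical history of eight small commits; the bug appears in "f6".
history = ["a1", "b2", "c3", "d4", "e5", "f6", "g7", "h8"]
checked = []


def is_bad(commit: str) -> bool:
    checked.append(commit)
    return history.index(commit) >= history.index("f6")


print(first_bad_commit(history, is_bad))  # f6
print(len(checked))                       # 3
```

Eight commits need only three checks, and the smaller each commit is, the less code remains to inspect once the culprit is found.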
Developer Testing – Why should developers write tests?
The section about testing from the AngularJS documentation has the following sentence which caught my attention
What Makes For Better Software Quality? Hint: It’s More Than Just Good Code.
Over the past decade, advancements in static analysis tools from both commercial and open source communities have dramatically improved the detection of developer violations of good coding practices. The ability to detect these issues in coding practices provides the promise of better software quality.
Yet many of these static analysis tools cannot detect the critical violations that exist in multilayer architectures, across transactions and multi-technology systems. These are the violations that lead to 90% of a system’s reliability, security and efficiency issues in production.
Lockdown Results and HHVM Performance
The HHVM team has concluded its first ever open source performance lockdown, and we’re very excited to share the results with you. During our two week lockdown, we’ve made strides optimizing builtin functions, dynamic properties, string concatenation, and the file cache. In addition to improving HHVM, we also looked for places in the open source frameworks where we could contribute patches that would benefit all engines. Our efforts centered around maximizing requests per second (RPS) with WordPress, Drupal 7, and MediaWiki, using our oss-performance benchmarking tool.
SQL Queries Comparison with Blackfire
One of the most popular Blackfire features is the ability to compare profiles. We have a unique way to display comparisons, both as a call graph and as a function calls table.
Introduction to Microservices
Microservices are currently getting a lot of attention: articles, blogs, discussions on social media, and conference presentations. They are rapidly heading towards the peak of inflated expectations on the Gartner Hype cycle. At the same time, there are skeptics in the software community who dismiss microservices as nothing new. Naysayers claim that the idea is just a rebranding of SOA. However, despite both the hype and the skepticism, the Microservice architecture pattern has significant benefits – especially when it comes to enabling the agile development and delivery of complex enterprise applications.

Mobile

Make A Gallery-Like Image Grid Using Native Android
Previously I had written an article regarding how to make a gallery-like image grid using Ionic Framework, but what if we wanted to accomplish the same using the native Android SDK?
In this tutorial we’ll see how to make use of the Android GridView with an image adapter to display remote images from the internet.

System Engineering

It’s The Future
Editor Note: At the risk of spoiling the joke a bit, we want to make sure that everyone knows that the following is satire, and we’re actually quite fond of the companies we mention. Docker, CoreOS, Google, Vagrant/Hashicorp, Heroku, Aphyr, Amazon, Mongo, Redis—we love you really and mean you no harm. :) Enjoy!
27 Best Practice Tips on Amazon Web Services Security Groups
AWS Security Groups are one of the most used and abused configurations inside an AWS environment, as anyone who has been on the cloud for a long time knows. Since AWS security groups are simple to configure, users often ignore their importance and do not follow the related best practices. In reality, operating on AWS security groups every day is much more intensive and complex than configuring them once. Actually nobody talks about it! So in this article, I am going to share our experience in dealing with AWS Security groups since 2008 as a set of best practice pointers covering both configuration and day-to-day operations.
Plans for Redis 3.2
I’m back from Paris, DotScale 2015 was a very interesting conference. Before leaving I was working on Sentinel in the context of the unstable branch: the work was mainly about connection sharing. In short, it is the ability of a few Sentinels to scale, monitoring many masters. Before leaving, and now that I’m back, I tried to “secure” a set of features that will be the basis for Redis 3.2. In the next weeks I’ll be focusing on developing these features, so I thought it’s worth sharing the list with you ASAP.
Technology Preview: CoreOS Linux and xhyve
Yesterday a new lightweight hypervisor for OS X was released called xhyve; if you are familiar with qemu-kvm on Linux, it provides a roughly similar experience. In this post we are going to show how to run CoreOS Linux under xhyve. While this is all very early and potentially buggy tech, we want to give you some tips on how to try CoreOS Linux with xhyve and run Docker or rkt on top.
How to Backup Linux with Snapshots
While working on different web projects I have accumulated a large pool of tools and services to facilitate the work of developers, system administrators and DevOps. One of the first challenges that every developer faces at the end of each project is backup configuration and maintenance of media files, UGC, databases, application and servers’ data (e.g. configuration files). Nowadays, there are a lot of solutions to make a snapshot backup of the entire server, and I decided to make a list of the most convenient and really useful tools and services.
Docker Monitoring Support
Containers and Docker are all the rage these days. In fact, containers — with Docker as the leading container implementation — have changed how we deploy systems, especially those comprised of micro-services. Despite all the buzz, however, Docker and other containers are still relatively new and not yet mainstream. That being said, even early Docker adopters need a good monitoring tool, so last month we added Docker monitoring to SPM. We built it on top of spm-agent – the extensible framework for Node.js-based agents and ended up with spm-agent-docker.
The VCL Cookie Monster
This month’s tip is more a theoretical exercise than anything else, just to show the power of VCL, and to explain a few regular expressions. I’m going to discuss VCL that deletes cookies.
Let’s assume you have a Varnish server that only serves static content. Ideally you don’t want the browser to send any Cookie headers, to keep the request as small as possible. This means you don’t want your Varnish to send any Set-Cookie headers to the browser.
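To give a feel for the kind of regular expression involved, here is a minimal sketch that strips selected cookies from a `Cookie` header. It is written in Python rather than VCL and is not taken from the linked article: the function name, the `__utm[a-z]+` name pattern, and the sample header are assumptions chosen for illustration.

```python
import re


def drop_cookies(cookie_header: str, name_pattern: str) -> str:
    """Remove every cookie whose name matches name_pattern from a Cookie header."""
    # A cookie starts either at the beginning of the header or after "; ".
    pattern = re.compile(r"(^|;\s*)(" + name_pattern + r")=[^;]*")
    cleaned = pattern.sub("", cookie_header)
    # Removing the first cookie can leave a dangling "; " at the start.
    return re.sub(r"^;\s*", "", cleaned)


# Hypothetical header: two analytics cookies around a session cookie.
header = "__utma=1; session=abc; __utmz=2"
print(drop_cookies(header, r"__utm[a-z]+"))  # session=abc
```

For a server that only serves static content, the simpler option described above is to drop the headers entirely rather than filter individual cookies.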
HAProxy and HTTP Strict Transport Security (HSTS) header in HTTP redirects
Unfortunately, many applications were written for HTTP only and switching to HTTPS is not an easy and straightforward path. Read more here about the impact of TLS offloading (when a third party tool performs TLS in front of your web application servers).
Docker and the Three Ways of DevOps Part 2: The Second Way – Amplify Feedback Loops
In the previous post in this series, we discussed patterns of applying DevOps principles in a way that yields high performance outcomes. In part two, we are going to discuss what is called “The Second Way”. This second “way” is ultimately about amplifying and shortening feedback loops such that corrections can be made fast and continuously. This is sometimes referred to as the right to left flow.
Docker Security: More than Meets the Eye
A week or two ago The Register carried a story about a Docker image security audit a company called BanyanOps had carried out. BanyanOps published its findings in a blog post, and frankly, they are quite alarming.
Inside NGINX: How We Designed for Performance & Scale
NGINX leads the pack in web performance, and it’s all due to the way the software is designed. Whereas many web servers and application servers use a simple threaded or process-based architecture, NGINX stands out with a sophisticated event-driven architecture that enables it to scale to hundreds of thousands of concurrent connections on modern hardware.
Exponential Smoothing for Time Series Forecasting
Time series anomaly detection is a complicated problem with plenty of practical methods. It’s easy to find yourself getting lost in all of the topics it encompasses. Learning them is certainly an issue, but implementing them is often more complicated. A key element of anomaly detection is forecasting - taking what you know about a time series, either based on a model or its history, and making decisions about values that arrive later. You know how to do this already. Imagine someone asked you to forecast the prices for a certain stock, or the local temperature over the next few days. You could draw out your prediction, and chances are it’s a pretty good one. Your brain works amazingly well for problems like this, and our challenge is to try to get computers to do the same.
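The forecasting step described above can be sketched with simple exponential smoothing, where each smoothed value blends the newest observation with the previous smoothed value. This is a minimal sketch and not the linked article's code: the function name, the `alpha` value, and the sample readings are assumptions.

```python
from typing import List, Sequence


def exponential_smoothing(series: Sequence[float], alpha: float) -> List[float]:
    """Simple exponential smoothing: s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    smoothed = [series[0]]  # seed with the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical readings; the last smoothed value serves as the one-step forecast.
readings = [10.0, 12.0, 11.0, 13.0]
smoothed = exponential_smoothing(readings, alpha=0.5)
print(smoothed)      # [10.0, 11.0, 11.0, 12.0]
print(smoothed[-1])  # 12.0
```

A larger `alpha` reacts faster to recent values, a smaller one smooths more aggressively; a value that arrives far from the forecast is a candidate anomaly.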
Socket Sharding in NGINX Release 1.9.1
NGINX release 1.9.1 introduces a new feature that enables use of the SO_REUSEPORT socket option, which is available in newer versions of many operating systems, including DragonFly BSD and Linux (kernel version 3.9 and later). This socket option allows multiple sockets to listen on the same IP address and port combination. The kernel then load balances incoming connections across the sockets.
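The socket option itself can be tried outside NGINX. The sketch below opens two listening sockets on the same address and port with `SO_REUSEPORT` set; it is a minimal illustration of the option, not of NGINX's implementation, and the helper name, loopback address, and backlog value are assumptions. It only works on platforms where Python exposes `socket.SO_REUSEPORT`.

```python
import socket


def open_reuseport_listener(host: str, port: int) -> socket.socket:
    """Open a TCP listener with SO_REUSEPORT set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Every socket sharing the port must set the option before binding.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock


if __name__ == "__main__":
    if hasattr(socket, "SO_REUSEPORT"):
        # Port 0 lets the kernel pick a free port for the first listener.
        first = open_reuseport_listener("127.0.0.1", 0)
        port = first.getsockname()[1]
        # The second bind would fail with "address in use" without the option.
        second = open_reuseport_listener("127.0.0.1", port)
        print("two listeners share port", port)
        first.close()
        second.close()
    else:
        print("SO_REUSEPORT is not available on this platform")
```

In a real deployment each listener would live in its own worker process, and the kernel would distribute incoming connections among them.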
Logstash Config File Organization
I’m currently at Elastic{ON} 15, which has been an amazing experience so far. Someone had the brilliant idea to set up an Apple-Style Genius bar where you just walk up and talk to someone from Elastic support. Sometimes you get contributors to the project you’re asking about too. It’s great.
Logstash Grok Speeds
Logstash and its Grok filter are excellent and I love them, but it was going so slow that the data was useless by the time I had finally ingested it to review it. Here’s what was wrong and how I fixed it.

Network Engineering

Massive route leak causes internet slowdown
Earlier today a massive route leak initiated by Telekom Malaysia (AS4788) caused significant network problems for the global routing system. Primarily affected were Level3 (AS3549 – formerly known as Global Crossing) and their customers. Below are some of the details as we know them now.

Databases

How Partition Pruning Works in Vertica
In Vertica we can take advantage of “Partition Pruning” to speed up our queries.
So what happens when we query a partitioned table in Vertica?

Web Performance

Optimizing a Complex Site for Pagespeed II
In Part I of this series on optimizing websites for Pagespeed we optimized a complex site using mod_pagespeed, some manual tweaks and WPOptimize Speed by xTraffic. We boosted the mobile Pagespeed score from 58/100 to 86/100, and the desktop score from 75/100 to 93/100. In Part II of this series we’ll explore how to speed optimize the server response of a Wordpress site with caching plug-ins.

Management & organization

The Best Metrics for Cultural Change in DevOps Teams
Everyone wants to optimize their team’s performance, but coming up with a good plan for doing so isn’t always easy. That’s why operationally mature DevOps teams use metrics to gain valuable insight into their work, enhance their capacity, and drive cultural change.
How I Visualize My Time Spent Programming
As a developer and CTO, I’m always looking for new ways to visualize my effectiveness as a programmer. I want to track things like “What was my most productive day of the week?” “What programming languages have I used the most?” “What files do I spend the most time in?”
6 Key Skills You Need to Build a Career in the Cloud: Part 1
We recently shared advice on becoming a cloud computing leader within your company (See 4 Steps to Becoming a Cloud Leader in Your Company.) Now we’re focusing on what it takes to build a cloud-first career. The idea is to make yourself invaluable to the growing numbers of organizations where cloud computing is becoming the default mode of operations. These firms need bona fide expertise to get the best business results out of their strategic technology investments in public, private, and hybrid cloud approaches.
6 Key Skills You Need to Build a Career in the Cloud: Part 2
Cloud computing clearly puts a lot of stress on the career plans of many IT professionals. Success in the cloud demands a whole different set of skills than the ones that helped advance careers in more traditional IT environments.
In Part One of this series, we looked at the technology-oriented skills required in the cloud. Here in Part Two we’ll check out the softer, more business-oriented skill sets, and delve into the critical role of metrics and analytics.
Defining innovation capacity, part 1: The innovation value of your technology stack
Today’s ideas can become tomorrow’s indispensable service. This doesn’t happen by magic. It takes a toolbox designed for innovative work.
For example, GitHub is a very successful service, the standard for code repositories. Developers provide their GitHub profiles to show their chops when they apply for new jobs.
Defining innovation capacity, part 2: Flexibility capacity
No one sets out to design for inflexibility. However, many organizations have discovered delay is built into their IT systems just as they were about to deploy new capabilities. Spinning up a virtual machine (VM) could take minutes or hours. Tying it into your development environment takes even more time. Integrating it with the rest of your infrastructure is complex and daunting.
Optimize Team Output, Not Individual Output
As teams and products grow, workflows and development processes get ingrained into the team. It starts with the early employees; processes are formed around them and the way they work. When engineering teams go from small to medium and then from medium to large, old workflows can become a large issue.
Those workflows are often still optimized for the small to medium-sized team, not to mention optimized specifically for those first employees. Not everyone who joined later is properly up to speed, nor do they necessarily have the means to tackle problems that are coming up. I touched on this exact problem recently when I wrote about using cloud services to increase productivity in teams.
Delegating Is Not Just for Managers
I remember most the tiredness that would come and stick around through the next day. After late nights where the effort had been successful, the tiredness was kind of a companion that had accompanied me through battle. After late nights of futility, it was a taunting adversary that wouldn’t go away. But whatever came, there was always tiredness.
Staying Sharp While Exercising
In my last blog post I explained how to reclaim your mornings and make them the most productive time of day. In this one I’ll explain how exercise makes my mornings better, and how I avoid feeling sluggish after overdoing it.
Boost Your Productivity In Three Easy Steps
If you’re like a lot of knowledge workers, you might feel that you spend your time unproductively. You seem to “do stuff” all day long but then feel that you’ve done nothing but “stuff” at the end of the day.
How can you change this? I’ve found three things that work for me to not only stay focused and achieve my objectives, but also help me feel better about myself. You see, although on a less focused day I might not feel very productive, it’s not that I’ve failed to achieve anything (though I might have achieved fewer of my most valuable goals). It’s that I’ve worked with an unclear mind, and later cannot remember exactly what I did during the day. This leads directly to self-doubt and self-criticism.