Sunday, February 16, 2014

C++14 is here

Today we finished the Issaquah meeting of the ISO C++ Standards Committee.


C++14 International Standard

The most relevant news is that we have approved C++14 for DIS (Draft International Standard). Formally, this means that we send the document to ISO for a final ballot. There is an unlikely possibility that an additional ballot will be needed. In practice, it means that C++14 is ready. It is very likely to be published during 2014, with a small chance that publication slips to early 2015. So, this time we got it in the year we were expecting. Wow!

More on C++ Technical Specifications

We also approved the initial drafts for some technical specifications. In particular:
  • A working paper of the Concepts Lite technical specification. In essence, this will allow templates to be constrained with predicates. Once concepts are available, you will (finally) get meaningful error messages for most generic code.
  • A working paper for the Parallelism technical specification.
  • A working paper for the Concurrency technical specification.
We decided to have one more technical specification on Transactional Memory.
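
To give a feeling for what "constraining templates with predicates" means, here is a minimal sketch. It does not use the Concepts Lite syntax; it emulates the idea with std::enable_if, which is what we can write today. The names is_number, twice and can_twice are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical predicate: which types count as "numbers" in this sketch.
template <typename T>
constexpr bool is_number() {
  return std::is_arithmetic<T>::value;
}

// The template is only viable when the predicate holds for T.
// For any other T the call is rejected at the call site,
// not somewhere deep inside the body.
template <typename T,
          typename = typename std::enable_if<is_number<T>()>::type>
T twice(T x) {
  return x + x;
}

// Detects whether twice() can be called with a T (SFINAE detection).
template <typename T, typename = void>
struct can_twice : std::false_type {};

template <typename T>
struct can_twice<T, decltype(void(twice(std::declval<T>())))>
    : std::true_type {};

static_assert(can_twice<int>::value, "int satisfies the predicate");
static_assert(!can_twice<const char*>::value,
              "pointers do not satisfy the predicate");

inline void check_twice() {
  assert(twice(21) == 42);
}
```

With concepts, the predicate would be part of the template declaration itself, and the compiler could report which requirement failed.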

Significant progress has been made on the Library Fundamentals technical specification.

In addition, the File System technical specification has gone through the resolution of many national body comments. It is highly likely that we will have a final ballot on this technical specification after the next committee meeting (to be held in June in Switzerland).

C++14 last minute modifications

You may already be aware of the status of C++14, so here you will only find an incomplete list of the latest changes.

First of all, many corrections from defect reports and comments issued by National Bodies have been made, both to the library and to the language. In this post I do not enumerate those minor changes.

Perhaps you have already seen the excellent explanation from Stephan T. Lavavej about why we should not use the rand() function. If you have not, please have a look at this video. C++14 does not deprecate rand(), but it will contain a strong note discouraging its use because of its questionable quality and performance.

A new metafunction alias called tuple_element_t has been introduced. It gives a simpler way of extracting a type from a tuple, so that

typename tuple_element<I, tuple<T...>>::type

becomes

tuple_element_t<I,tuple<T...>>
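
As a small sketch of the difference, the following compares both spellings. The names record and copy_element are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>

using record = std::tuple<int, std::string, double>;

// C++11 spelling and C++14 alias name the same type.
static_assert(std::is_same<std::tuple_element<1, record>::type,
                           std::tuple_element_t<1, record>>::value,
              "both spellings name the same type");

// In a dependent context the alias also saves the leading typename.
template <std::size_t I, typename... T>
std::tuple_element_t<I, std::tuple<T...>>
copy_element(const std::tuple<T...>& t) {
  return std::get<I>(t);
}

inline void check_copy_element() {
  record r{1, "one", 1.0};
  assert(copy_element<1>(r) == "one");
}
```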

There are also a number of changes related to concurrency in C++14:

  • shared_timed_mutex is the new name for the old shared_mutex. This makes naming of mutexes more consistent and leaves room for a future extension.
  • A set of clarifications has been introduced about the interactions of a signal handler and threads that are running.
  • The concept of lock-free execution has also been clarified.
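
A minimal sketch of how shared_timed_mutex might be used: readers take a std::shared_lock and writers take an exclusive lock. The class counter and the function run_writers are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

class counter {
public:
  int get() const {
    // Shared ownership: several readers may hold the lock at once.
    std::shared_lock<std::shared_timed_mutex> lock(m_);
    return value_;
  }

  void increment() {
    // Exclusive ownership: one writer, no readers.
    std::lock_guard<std::shared_timed_mutex> lock(m_);
    ++value_;
  }

private:
  mutable std::shared_timed_mutex m_;
  int value_ = 0;
};

// Launches n_threads writers, each incrementing the counter n times.
inline int run_writers(int n_threads, int n) {
  counter c;
  std::vector<std::thread> pool;
  for (int t = 0; t < n_threads; ++t) {
    pool.emplace_back([&c, n] {
      for (int i = 0; i < n; ++i) c.increment();
    });
  }
  for (auto& t : pool) t.join();
  assert(c.get() == n_threads * n);
  return c.get();
}
```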

The Library Fundamentals Technical Specification

A collection of components has been incorporated in the Library Fundamentals TS. Some of them are:
    • string_view: A lightweight type to reference a sequence of immutable char-like objects. Similar types have been previously developed by Google, LLVM, and Bloomberg, justifying the need for standardization.
    • any: A type-safe container for a single value. This class is heavily based on boost::any.
    • A set of allocator facilities that make it easy to keep the allocator from being part of a type.
    • Faster string search algorithms (Boyer-Moore and Boyer-Moore-Horspool).
    • sample(): An algorithm for randomly sampling elements from a sequence.
    • A set of variable templates that allow a simple way of spelling type traits (is_arithmetic_v<T> instead of is_arithmetic<T>::value).
    • apply(): Allows applying an arbitrary function to a tuple.
    • A reformulation of iterator_traits so that it does not generate a hard error when its template argument does not have the expected member types. This allows the trait to be used in an SFINAE context.
    • A reformulation of the type trait common_type to avoid a hard error when there is no common type. This allows the trait to be used in an SFINAE context.
    • shared_ptr extensions to support arrays (this allows both shared_ptr<T[]> and shared_ptr<T[N]>).
    • A library solution for performing byte ordering conversions (through hton<T>(T x) and ntoh<T>(T x)). These functions were originally planned for the networking technical specification, but have finally been moved here.
    • optional: A class to support types that may or may not hold a value. This type is highly influenced by boost::optional.
    • invocation_type and raw_invocation_type: Type traits for callable objects.
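
To make a few of these components more concrete, here is a small sketch. It assumes a C++17 compiler, where string_view, optional, apply and the _v variable templates ended up in namespace std; the TS itself places them in an experimental namespace. The helper names first_word, parse_digit, sum3 and sum_of are hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <string_view>
#include <tuple>
#include <type_traits>

// string_view: look at part of a string without copying characters.
inline std::string_view first_word(std::string_view text) {
  return text.substr(0, text.find(' '));
}

// optional: a result that may or may not hold a value.
inline std::optional<int> parse_digit(char c) {
  if (c < '0' || c > '9') return std::nullopt;
  return c - '0';
}

// apply: call a function with the elements of a tuple as arguments.
inline int sum3(int a, int b, int c) { return a + b + c; }

inline int sum_of(const std::tuple<int, int, int>& t) {
  return std::apply(sum3, t);
}

// Variable template instead of is_arithmetic<double>::value.
static_assert(std::is_arithmetic_v<double>, "double is arithmetic");

inline void check_fundamentals() {
  assert(first_word("hello world") == "hello");
  assert(!parse_digit('x').has_value());
  assert(sum_of(std::make_tuple(1, 2, 3)) == 6);
}
```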

The Parallelism Technical Specification

An initial working draft has been approved. This technical specification contains a library interface for a set of algorithms that can be executed under different execution policies (including sequential, parallel, and vector).

The Concurrency Technical Specification

An initial working draft has been approved.

The technical specification will include executors, which are objects that can execute units of work in the form of function objects. The concept of a thread pool is included here.

Besides, this TS also includes improvements to std::future.




Tuesday, July 2, 2013

Passing arguments to std::thread

The C++11 standard library has a powerful mechanism for managing threads: class std::thread, which follows the principle of providing a simple interface for doing simple things.

Let's start with a very simple and useless example:


void f() {
  do_some_work();
}

void g() {
  do_some_more_work();
}

void h() {
  std::thread t1{f}; // Launch f in a thread
  std::thread t2{g}; // Launch g in a thread

  t1.join(); // Blocks until t1 finishes
  t2.join(); // Blocks until t2 finishes
}

I say it is useless because for most uses of concurrency one needs to send arguments to the threads. However, it is simple to follow. Function h() launches two threads: t1 and t2. Thread t1 runs function f() and thread t2 runs function g().

Now let's add arguments to be passed to our threads.

void f(int n) {
  for (int i=0; i<n; ++i) {
    do_some_work();
  }
}

void g(int a, int b) {
  for (int i=a; i<=b; ++i) {
    do_some_more_work();
  }
}

void h() {
  std::thread t1{f, 10}; // Launch f(10) in a thread
  std::thread t2{g, 15, 25}; // Launch g(15,25) in a thread

  t1.join(); // Blocks until t1 finishes
  t2.join(); // Blocks until t2 finishes
}

Now we can launch threads that call functions with any number of arguments in a simple way. At this point, you may remember how long you spent doing type casts with your good old pthreads or a similar API.

Looks wonderful, doesn't it? Well, now is when the tricky part comes onto the scene.

What is the real signature of the std::thread constructor?

template <typename F, typename ... Args> 
explicit thread(F&& f, Args&&... args);

What is this ellipsis in the template constructor? This is a variadic template, a C++11 mechanism to provide a template with a variable number of template arguments. You can see this declaration as the following family of declarations:

template <typename F, typename A1> 
explicit thread(F&& f, A1&& a1);

template <typename F, typename A1, typename A2> 
explicit thread(F&& f, A1&& a1, A2&& a2);

template <typename F, typename A1, typename A2, typename A3> 
explicit thread(F&& f, A1&& a1, A2&& a2, A3&& a3);

And so on.
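
To see a variadic template at work outside std::thread, here is a minimal sketch. The names count_args, call and add3 are hypothetical; call only mimics the shape of the constructor above and does not launch any thread.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// sizeof...(Args) gives the number of arguments in the pack.
template <typename... Args>
std::size_t count_args(Args&&...) {
  return sizeof...(Args);
}

// Same shape as the thread constructor: a callable plus its arguments.
// Here the callable is simply invoked in the current thread.
template <typename F, typename... Args>
auto call(F&& f, Args&&... args)
    -> decltype(std::forward<F>(f)(std::forward<Args>(args)...)) {
  return std::forward<F>(f)(std::forward<Args>(args)...);
}

inline int add3(int a, int b, int c) { return a + b + c; }

inline void check_variadic() {
  assert(count_args() == 0);
  assert(count_args(1, 2.0, 'c') == 3);
  assert(call(add3, 1, 2, 3) == 6);
}
```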

If you are not (yet) used to C++11, you may be wondering what those double ampersands are. You probably already know that there is no reference-to-reference in C++. So, what is this? In general, a double ampersand denotes an r-value reference. However, in combination with template argument deduction it follows some special rules. Scott Meyers named this combination a universal reference. In practice, this means that if you pass an l-value (e.g. a variable) the parameter binds to it as an l-value reference, and if you pass an r-value (e.g. a temporary resulting from an expression) the parameter binds to it as an r-value reference.
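
The deduction rules can be observed with a small sketch. The function bound_to_lvalue and the three helper functions are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// T&& with a deduced T: T is int& for an l-value argument of type int,
// and plain int for an r-value argument.
template <typename T>
bool bound_to_lvalue(T&&) {
  return std::is_lvalue_reference<T&&>::value;
}

inline bool variable_case() {
  int x = 0;
  return bound_to_lvalue(x);  // T = int&, parameter type int&
}

inline bool literal_case() {
  return bound_to_lvalue(42);  // T = int, parameter type int&&
}

inline bool moved_case() {
  int x = 0;
  return bound_to_lvalue(std::move(x));  // T = int, parameter type int&&
}

inline void check_universal_reference() {
  assert(variable_case());
  assert(!literal_case());
  assert(!moved_case());
}
```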

With this in mind, we can pass variables to a thread:

void f(int n) {
  for (int i=0; i<n; ++i) {
    do_some_work();
  }
}

void g(int a, int b) {
  for (int i=a; i<=b; ++i) {
    do_some_more_work();
  }
}

void h() {
  int x = 10, y = 15, z = 25;
  std::thread t1{f, x}; // Launch f(x) in a thread
  std::thread t2{g, y, z}; // Launch g(y,z) in a thread

  t1.join(); // Blocks until t1 finishes
  t2.join(); // Blocks until t2 finishes
}

Then, can I use functions taking arguments by reference? Well, yes and no. I mean, it is not so easy.

Let's start with a broken example:

void f(int & n) {
  ++n;
}

void g() {
  int x =10;
  std::thread t1(f, x);
  t1.join();
  std::cout << x << std::endl;
}

If you compile that piece of code, you will most likely get a compiler error. What is the problem?

The standard requires that a copy of each argument is made for the new thread, and that copy is then handed to your function as an r-value. Thus it only works with parameters taken by value or by const reference, but not by non-const reference.

Magically, the solution comes with std::ref() (and its colleague std::cref()). Both build a reference_wrapper from a variable.
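
A small sketch can make the copy visible. The function running in the thread compares the address of its parameter with the address of the original variable. The names compare_address and thread_sees_a_copy are hypothetical.

```cpp
#include <cassert>
#include <thread>

// Runs in the new thread. n is bound to whatever the thread was given;
// original points to the variable in the launching thread.
void compare_address(const int& n, const int* original, bool* is_copy) {
  *is_copy = (&n != original);
}

inline bool thread_sees_a_copy() {
  int x = 10;
  bool is_copy = false;
  // x is copied for the new thread; the pointers are copied too,
  // but they still point to x and is_copy in this function.
  std::thread t{compare_address, x, &x, &is_copy};
  t.join();  // After join() it is safe to read is_copy
  return is_copy;
}

inline void check_copy() {
  assert(thread_sees_a_copy());
}
```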

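#include <functional> // std::ref
#include <iostream>
#include <thread>

void f(int & n) {
  ++n;
}

void g() {
  int x = 10;
  std::thread t1(f, std::ref(x)); // Pass x by reference
  t1.join();
  std::cout << x << std::endl; // Prints 11
}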

So, remember it. If you want to pass something by reference to a thread, you need to use std::ref(). If you want to pass something by const reference to a thread, you need to use std::cref().
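
As a sketch of both wrappers used together, the following thread reads a vector through std::cref() and writes its result through std::ref(). The names sum_into and sum_in_thread are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Reads v through a const reference and writes the sum through out.
void sum_into(const std::vector<int>& v, long& out) {
  long s = 0;
  for (int e : v) s += e;
  out = s;
}

inline long sum_in_thread(const std::vector<int>& values) {
  long total = 0;
  // std::cref avoids copying the vector; std::ref lets the thread modify total.
  std::thread t{sum_into, std::cref(values), std::ref(total)};
  t.join();  // values and total must outlive the thread
  return total;
}

inline void check_sum() {
  std::vector<int> v{1, 2, 3, 4};
  assert(sum_in_thread(v) == 10);
}
```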

However, in most cases there are better approaches than passing references to threads. Please consider whether you absolutely need to pass a reference to a thread before doing so. References can be a source of data races, or they can end up dangling if the referenced object is destroyed while the thread is still using it. In summary, you should be very sure of what you are doing before passing by reference to a thread.

Next time, I will try to write something about returning values from threads.

Wednesday, June 26, 2013

A walk through C++11

Yesterday I had the pleasure of sharing the afternoon with a good number of C++ developers.

The company Indizen kindly organized an event attended by its own developers as well as developers from several financial institutions. I already knew that C++ is widely used in the financial sector, but I did not expect such a good turnout. For two hours a good number of developers put up with my presentation on C++11, and they still had the energy to start a round of questions at nine in the evening.

Afterwards we shared a glass of Spanish wine, also courtesy of Indizen. I am very grateful to them for an excellent organization, as well as for choosing a venue as pleasant as the railway museum.

Obviously, in two hours one can only give a few brushstrokes of C++11, but I think the audience got an overview of the changes. And more than one person said in Spanish what I had already heard from Herb Sutter (C++ feels like a new language). And it really does.

We started the presentation with an overview of the standardization process of the language itself, putting it in context.

We continued with some features that make C++11 somewhat easier to learn and teach. Of course, whether it is easier is a matter of opinion, but we talked about the new uniform initialization syntax, type inference, range-based for loops, constant expressions, and user-defined literals. Since the audience were developers for the financial sector, and knowing that stochastic simulations in general (and Monte Carlo simulations in particular) are common practice in those environments, I could not resist saying something about the new library for random number generation and distributions.

I devoted another block of the talk to the C++ pattern par excellence: RAII. We recalled how the systematic use of this pattern is a good recipe for avoiding memory leaks in particular and resource leaks in general. From there, we moved on to how move semantics helps simplify this approach.

Regarding genericity, we saw the advantages introduced by lambda expressions and how they allow the generic algorithms of the STL to be used more simply. We saw some uses of variadic templates. We also saw examples of template aliases. Finally, we gave a short introduction to the C++11 concurrency model.

I took the opportunity to present concrete results on how new C++11 features can be used to optimize application performance, together with the results obtained in a specific case.

Finally, I announced a new event: C++ Day (Dia C++), which we will hold next October (November) of 2013. I also presented the contents of a C++11 course for C++ developers. It is an intensive 20-hour course that the ARCOS group already taught last May as an in-house training course at a company in the telecommunications sector.

If you are interested, the slides are available at:

https://sites.google.com/site/aviewoncomp/cpp11.pdf

Friday, June 7, 2013

Post-Doc position available at ARCOS research group



Yes. We are hiring!

I am glad to publicly announce that our REPARA proposal for an FP7-ICT project has been accepted and we will start next September. We are really excited about this. REPARA is a 3-year project involving 4 academic partners and 2 industrial partners (from Germany, Hungary, Israel, Switzerland, and Spain) which will address real challenges in adapting C++ software for heterogeneous parallel computers.

Below you will find a preview of our job offer. If you are interested and need additional information, please let me know. The official job offer will be available in a few days.

Post-Doc Position: REPARA FP7-ICT Project

The ARCOS research group (University Carlos III of Madrid, Spain) is opening a Post-Doc position in the context of the FP7 project REPARA (Reengineering and Enabling Performance and poweR of Applications), which will start in September 2013 with a duration of 3 years.

The REPARA project aims to help the transformation and deployment of new and legacy applications in parallel heterogeneous computing architectures while maintaining a balance between application performance, energy efficiency and source code maintainability. To achieve this goal REPARA will combine multiple approaches for refactoring C++ source code to target multiple programming models for parallel heterogeneous architectures.

In this context we are looking for a researcher to help us in the development of language extensions for C++ compilers that can be used for semantic annotation of generic software components. In particular, tasks include:

  1. To define mechanisms to restrict generic algorithms to type subsets, as well as techniques for overloading algorithms on type subsets.
  2. To define mechanisms for specifying the semantic properties of a library in generic terms.
  3. To apply these techniques for specifying libraries identified by use cases to provide multiple implementations for specific processing elements.

The applicant should hold a PhD in Computer Science or Computer Engineering with a strong background in compiler design. A strong knowledge of the C++ programming language is required. Previous experience in the development of compiler technologies with gcc or clang is a plus. Candidates should be fluent in English (written and spoken) and show their ability to work in a team.

For more information, contact Prof. J. Daniel Garcia at josedaniel.garcia@uc3m.es

Monday, April 22, 2013

C++14: A different view on a couple of issues

I'm back from the C++ meeting in Bristol. And the great news, as you probably know, is that we will be issuing a new CD (Committee Draft) for a new version of the C++ standard. That is, we hope to have something called C++14.

We have had a huge success in terms of participation in the standards meeting, with one hundred people working during a six-day week starting at 8:30 and with after-dinner evening sessions. Really impressive work from many people. For further information on the C++ Committee, you can find a lot of useful material at isocpp.org.

Other committee members write excellent reports on every meeting, and I always recommend reading the really good trip reports from Herb Sutter. Here you have the latest one.

So I thought I could write a different view on a couple of tiny details. I will give my very personal view on two issues: one that got accepted (generic lambdas) and one that got rejected (digit separators).

Concepts lite and generic lambdas

The committee decided that concepts lite should go in a separate Technical Specification (TS) and not in the standard, to give an opportunity for further experimentation. I tend to agree with this view, although there is a working implementation.

However, the committee decided to put generic lambdas in the International Standard for C++14. I like generic lambdas a lot. They are great! I have already felt the need for generic lambdas in some of my current implementations. But I feel that the safest path would be to put them in the same TS as concepts lite.

Although I know it is unlikely, it could happen that later in the process we find that we want a tighter integration of generic lambdas and concepts. I hope we do not regret our decision here. I agree that the risk is low, but it is still a risk.

Moreover, I think that having generic lambdas in a TS rather than in the IS would be the safest path, and implementors could still deliver their products with TS implementations.
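
For readers who have not seen them yet, this is a minimal sketch of a generic lambda as accepted for C++14. The name add is hypothetical. Each auto parameter behaves like a template parameter of the lambda's call operator.

```cpp
#include <cassert>
#include <string>

// One lambda, usable with any pair of types for which a + b is valid.
const auto add = [](auto a, auto b) { return a + b; };

inline void check_generic_lambda() {
  assert(add(1, 2) == 3);                                      // int + int
  assert(add(1.5, 2) == 3.5);                                  // double + int
  assert(add(std::string("ab"), std::string("cd")) == "abcd"); // strings
}
```

Note that nothing constrains the parameter types here, which is exactly where a tighter integration with concepts could matter.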

The recurring digit separators discussion

We discussed a lot of options for digit separators. The idea behind them is to make long numeric literals easier to write, so that they are more readable and code is easier to maintain.

Today, if you want to use the speed of light in your code, you do something like:

constexpr double speed_of_light = 299792458;

With the digit separator proposal that could be written:

constexpr double speed_of_light = 299_792_458;

Unfortunately, this proposal did not pass, because there was an argument about its interaction with UDLs (user-defined literals). The case that arose during the discussion was:

x = 0xFF_EB; // 255 Exabytes or 0xFFEB?

I do not think that writing Exabytes in hexadecimal is common enough to justify denying this service to user communities that use long numeric literal constants. But I may be wrong.
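
The ambiguity can be reproduced with a small sketch. The type exabytes and the literal suffix _EB are hypothetical. The last declaration assumes a compiler that implements the published C++14 standard, which later settled on the single quote as digit separator precisely to avoid this clash.

```cpp
#include <cassert>

// Hypothetical unit type and user-defined literal suffix.
struct exabytes {
  unsigned long long count;
};

constexpr exabytes operator""_EB(unsigned long long n) {
  return exabytes{n};
}

// With the suffix defined, this is 0xFF followed by the suffix _EB,
// that is, 255 exabytes, and not the number 0xFFEB.
constexpr auto big = 0xFF_EB;
static_assert(big.count == 255, "parsed as 0xFF with suffix _EB");

// Digit separator as it ended up in the published C++14 standard.
constexpr double speed_of_light = 299'792'458;
static_assert(speed_of_light == 299792458.0, "separators do not change the value");

inline void check_literals() {
  assert(big.count == 0xFF);
}
```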





Wednesday, January 16, 2013

The harms of Java

I have always considered it a tremendous error to use Java as the first language for teaching programming to Computer Science and Engineering students at University. I have a bunch of reasons for this. However, many schools do it, including my own University.

Some day I may write something about this, but I am afraid that I would need more space than a plain blog post.

I have used Java in industrial projects. The most common reason has been a customer requirement to do so. In those cases I have always warned customers of the dangers of such a decision.

Today I was thinking a little bit about the latest Java vulnerability and how Oracle has been managing it. I was reading this post from CERT at Carnegie Mellon recommending that you disable Java in your browser now.

This has brought to my mind two principles that I think are very important when selecting a language. Both can be summarized as: "Always try to avoid a single way of thinking".

Prefer standardized programming languages

I think that standardized programming languages should be preferred over others. Specifically, I am thinking of programming languages with an International Standard. The reason behind this is openness.

Many argue that design by committee is not the best way of designing a language. I may partially buy that argument. However, I think it is much better than the model where a single company designs a programming language and makes changes (sometimes incompatible changes) whenever it wants.

At least, any specification going through a standardization process involves industry providing the technology (i.e. compiler providers), academics and end users. It is true that sometimes some group may be underrepresented in a standards committee, but there are clear rules on how to become a member of a committee.

Anybody can implement an environment for a standard programming language. Just have a look at the number of different implementations for languages like C, C++, Ada, COBOL or FORTRAN. However, in recent years we have seen lawsuits about who has what rights over a language owned by a company.

Prefer compiled languages over virtual machine based languages

My usual argument for preferring truly compiled languages has always been performance. I know. Not everybody is concerned about performance, but I am.

However, security and safety emerge as another strong reason. We have seen how recent updates to Java Virtual Machines have turned usable applications into insecure applications without any intervention from the developer. This has brought the new acronym WOBE (Write Once Break Everywhere) as opposed to the promised WORE (Write Once Run Everywhere).

In contrast, with a truly compiled application where you deploy binaries, it is much more difficult for such a thing to happen. I will not say impossible, because "impossible is nothing". But at least, very difficult.

Is there room for Java-like languages?

Sure there is. Java has proven to be excellent in combination with application servers. You can get cheap programmers to do so. Hey. Wait a minute! That is true as long as you are not concerned with energy efficiency. Surely this has a huge environmental impact for data centers and for the life of your battery.

Then what? Well, I really don't know.

NOTICE: I am happy to get comments on this post. However, I will remove any comment that I consider offensive or that is not supported by some sort of reasoning.