I've looked a bit into mixing languages with Xtext and linking Xtext and the CDT. Here are some preliminary findings:
1) It is possible to do "grammar mixins" in Xtext so it is theoretically possible to have the core grammar separate from the action language grammar, but...
2) the mixin is static so we would have separate mixin grammars for UMLRT+C++, UMLRT+Java, UMLRT+Alf, etc. This would require different file extensions for each such combination (which is how Xtext can determine which language it will be parsing).
3) There is no existing Xtext grammar for C or C++, which is not very surprising because those languages are notoriously hard to parse (thanks to Ritchie and Stroustrup). I'm not even completely sure that Xtext's LL(*) parsers can deal with C++ (as its grammar is context-dependent), and even if they can, preprocessing directives would be incredibly difficult to handle. However...
4) in the code generator we already have an intermediate meta-model for a significant subset of C++. This could be transformed into an EMF meta-model and an Xtext grammar generated from it.
5) Some people have looked into integrating the CDT and Xtext, but going in the opposite direction: embedding Xtext miniDSLs in C++ code. These seem to have had limited success.
6) Some people have worked on extending the CDT to make it "multilingual". But it seems to require modifications to the CDT itself, and the use of the CDT's own AST or something based on it, which will not be appropriate for many languages, including UML-RT.
7) It is possible to override the behaviour of the Xtext-generated outline, formatter, content assist, quick fixes, etc. Hence we could delegate those activities to another component, but it is not clear to me how to delegate to the CDT (or JDT), if that is possible at all.
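
To make the consequence of point 2 concrete, here is a minimal plain-Java sketch of the "one file extension per static mixin" idea. It does not use any Xtext API; the class name, the extensions (umlrtcpp, umlrtjava, umlrtalf) and the grammar labels are all made up for illustration and are not existing conventions.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical registry: one file extension per statically mixed grammar. */
final class MixinLanguageRegistry {

    // Extensions and grammar labels are illustrative assumptions only.
    private static final Map<String, String> GRAMMAR_BY_EXTENSION = Map.of(
            "umlrtcpp", "UMLRT+C++",
            "umlrtjava", "UMLRT+Java",
            "umlrtalf", "UMLRT+Alf");

    private MixinLanguageRegistry() {}

    /** Returns the mixed grammar registered for the file's extension, if any. */
    static Optional<String> grammarFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty(); // no extension, so no language can be chosen
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(GRAMMAR_BY_EXTENSION.get(extension));
    }

    public static void main(String[] args) {
        System.out.println(grammarFor("Controller.umlrtcpp").orElse("unknown"));
        System.out.println(grammarFor("Controller.txt").orElse("unknown"));
    }
}
```

The point of the sketch is only that the choice of action language is fixed by the file name before parsing starts, so every new UMLRT+X combination means another grammar and another table entry.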