Friday, February 7, 2014

kate: intelligent code completion for all languages!

... well, maybe that's a bit of an exaggeration, but it's certainly much more intelligent than before. Look:

Code completion in CSS
... bash
... Lua
... PHP
even Gnuplot!
Note how this one has a different set of possible items for the same query, respecting the context.
Even Mathematica ;) This image shows a problem which still needs to be fixed: in case-insensitive languages, all completion suggestions are lowercased (which is not technically wrong of course, but a bit ugly). It's easy to fix but simply not done yet.
This pays off in some unexpected places, even KDevelop: for example, through this we now get code completion for all keywords in doxygen comments:
Completion for doxygen keywords inside a doxygen comment
Of course, those only appear inside actual doxygen, and not C++. When the cursor is in C++ code, it shows the C++ keywords instead (but they will not be very visible in KDevelop, since they're sorted below KDevelop's suggestions, which are better).

How does this work?

Short answer: magic! Correct answer: it uses the highlighting files. For highlighting, kate has a list of possible keywords for each language, stored in highlighting files (/usr/share/apps/katepart/syntax/$language.xml). Those keywords are even context-sensitive: you will notice that e.g. the PHP highlighter does not highlight PHP function names inside comments or strings. So, the highlighting engine needs to know which keywords are valid at which position. Those are precisely the keywords which are suggested in the list.
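
To make the idea concrete, here is a minimal sketch in Python. It is not kate's actual implementation (which lives in the C++ part); the XML below is a heavily trimmed, made-up definition that only borrows the general shape of a highlighting file (keyword lists, plus contexts that refer to those lists), and the names `load_keywords_by_context` and `complete` are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# A toy definition, loosely shaped like a highlighting file.
# Real files contain many more rules and attributes; this one is made up.
SYNTAX_XML = """
<language name="Toy">
  <highlighting>
    <list name="keywords">
      <item>function</item>
      <item>foreach</item>
      <item>return</item>
    </list>
    <list name="doc_tags">
      <item>@param</item>
      <item>@return</item>
    </list>
    <contexts>
      <context name="Code">
        <keyword String="keywords"/>
      </context>
      <context name="DocComment">
        <keyword String="doc_tags"/>
      </context>
      <context name="String"/>
    </contexts>
  </highlighting>
</language>
"""


def load_keywords_by_context(xml_text):
    """Map each context name to the keywords its rules refer to."""
    root = ET.fromstring(xml_text)
    lists = {
        lst.get("name"): [item.text.strip() for item in lst.findall("item")]
        for lst in root.iter("list")
    }
    by_context = {}
    for ctx in root.iter("context"):
        words = []
        for rule in ctx.findall("keyword"):
            words.extend(lists.get(rule.get("String"), []))
        by_context[ctx.get("name")] = words
    return by_context


def complete(by_context, context, prefix):
    """Suggest only the keywords that are valid in the given context."""
    return sorted(w for w in by_context.get(context, []) if w.startswith(prefix))


KEYWORDS = load_keywords_by_context(SYNTAX_XML)
print(complete(KEYWORDS, "Code", "f"))         # keywords valid in code
print(complete(KEYWORDS, "DocComment", "@r"))  # doc tags only inside doc comments
print(complete(KEYWORDS, "String", "f"))       # nothing is suggested inside strings
```

The point of the sketch is that the same query gives different suggestions depending on the context the cursor is in, because the lookup goes through the context first and the keyword list second.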

What now?

Now that we have this feature, I think we can make more out of it in quite a few cases. In particular, I want to invite you to have a look at your favourite language, and make sure all keywords / builtin functions / etc. are actually listed. Because of this feature, it might make sense to list keywords even for languages where they are not terribly helpful for highlighting; a prominent example would be HTML, where currently the highlighter is totally generic and does not actually look at e.g. the tag names (thus, there's no completion). If you fixed that by actually listing all valid HTML tag names, you'd get (1) better highlighting, e.g. you could mark undefined elements (think typos) as errors, and (2) completion for free.
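
As a rough illustration of the HTML case, the following Python sketch shows how one explicit list of tag names can serve both purposes. It says nothing about how the highlighting file itself would have to be written; the tag list is deliberately tiny and incomplete, and `classify_tags` and `complete_tag` are hypothetical helper names.

```python
import re

# Deliberately incomplete, hand-picked list; a real definition would list all tags.
KNOWN_TAGS = {"a", "body", "div", "head", "html", "p", "span", "title"}

# Matches the name of an opening or closing tag, e.g. "<div" or "</div".
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)")


def classify_tags(source):
    """Return (tag name, style) pairs: known tags are keywords, others are errors."""
    result = []
    for match in TAG_RE.finditer(source):
        name = match.group(1)
        style = "keyword" if name.lower() in KNOWN_TAGS else "error"
        result.append((name, style))
    return result


def complete_tag(prefix):
    """The same list doubles as the source of completion suggestions."""
    return sorted(t for t in KNOWN_TAGS if t.startswith(prefix.lower()))


print(classify_tags("<div><pp>text</pp></div>"))  # "pp" is flagged as an error
print(complete_tag("h"))
```

Both functions read from the same `KNOWN_TAGS` set, which is the whole argument: once the names are listed for one consumer, the other one gets them without extra work.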

Another thing which can be improved is the context sensitivity. Some languages already do this rather well, but many languages will highlight keywords also in places where it'd be easy to detect that the keyword does not make sense there. That doesn't matter that much for highlighting only, because generally users write code which makes sense, but still -- if you can detect it, both consumers of the highlighting data (the actual highlighting, and the completion engine) gain something from it. So, extra motivation for making things more exact! ;)

I'm sure we can do more cool stuff with this. If you can come up with a good idea -- tell me, I'm happy to talk about it.

12 comments:

  1. After looking at the highlighting files, I always wondered why keyword completion wasn't augmented by the values in there based on context. I'm happy to see that it actually got added. Is there a table some place that lists all of the current syntax files and how "complete" they are? It'd be nice to be able to see what's done and what needs to be done for each one instead of having to look through each of the files individually.

    Replies
    1. There isn't, but it's a good idea to have something like that -- a wiki page sounds like a good place to start such a table, maybe. :)

  2. Awesome work!
    I have an idea. Is there a way we can use the keywords to call up documentation of some sort? I think it was a feature that I saw in "bluefish," a gtk html editor, a long time ago. For instance, when I type the "<p>" tag in the code page, the documentation frame, when enabled (Kate/KDevelop), could show something like http://web.archive.org/web/20130301022357/http://learningforlife.fsu.edu/webmaster/references/xhtml/tags/text/p.cfm.

    I am not sure if what I have just recommended makes sense, but it would be really cool to implement something like that for html and php using php.net and any other available documentation engine. Whether the feature should be online-only or also work offline is up to you.

  3. Certainly an interesting idea.

    Presenting the information itself is not a problem. The completion model has an "extra item details" mode which is triggered by pressing Alt with an item selected. It's just not used by kate itself currently (but KDevelop uses it).

    I don't think we can ship full documentation. Besides the size, which might be hundreds of megabytes (just a guess, I don't know), it will be very hard to keep it up-to-date and might even have copyright issues in some cases.

    Also, what information to display? The page you have there is way too spread-out in its formatting to be displayed in a completion item. The text you want to display in there is a plain, simple docstring with a two-sentence description, not a complete manual page (that should be a separate plugin).
    Instead, the information from the page (valid sub-elements, valid attributes) should imo be integrated into the syntax file itself, such that it knows which attributes and child elements are valid in the current context. That will automatically restrict completion to those items and it will also enable you to mark wrong attributes e.g. in a different color.

    Maybe both issues could be solved by having a "description=" attribute in the item lists, which contains such a docstring. That could be generated from some kind of online docs depending on the language. For more extensive documentation, I think a separate plugin is the way to go (see KDevelop's documentation plugin).

    Replies
    1. Makes sense: "Instead, the information from the page (valid sub-elements, valid attributes) should imo be integrated into the syntax file itself, such that it knows which attributes and child elements are valid in the current context. That will automatically restrict completion to those items and it will also enable you to mark wrong attributes e.g. in a different color."

      I am assuming that the docstring in "Maybe both issues could be solved by having a "description=" attribute in the item lists, which contains such a docstring. That could be generated from some kind of online docs depending on the language." is the "simple docstring with a two-sentence description" from the previous paragraph.

      Sounds good to me. Keep up the good work

    2. Yes, that was the idea.

      Oh, huh, and then comes the translation issue. We can't throw megabytes of docs for random languages at the translators ...
      I'm unsure about this. It's certainly not the lowest-hanging fruit out there, and maybe we should first concentrate on making better use of the existing features in the syntax files. :)

  4. A masterpiece of work :-) Really cool patch

  5. My Kate doesn't have this intelligent code completion. I use Kate 3.11.5 in KDE 4.11.5. Which version do you use?

    Replies
    1. This feature is new and not yet in any released version of kate. You'll have to wait for the next one.

    2. Ok, thanks. Can't wait for next release.

  6. Thanks scummos, it will be really great to use it in Cantor scripts editor.
    Will this feature be available for Kate and for KTextEditor/Kate-part too?

    Best regards;

    Replies
    1. Yes, it's implemented in the part. It will work for all applications based on that.
