Once a month I scan my physical mail and receipts into Evernote, and then rename each of the new notes to something meaningful.

Before I rename them, the new notes have titles such as “CCE272012_00003.pdf” and “CCI282012_00029.jpg”:

image

I wondered whether it might be possible to write a program to automatically generate a title from the text that Evernote finds within the scanned image …

It turns out that it is possible.  These are exactly the same notes as shown above, with their new titles automatically generated from the text Evernote found in the scanned images.

image

This is just a proof of concept, but if you are interested in using it then please let me know in a comment on this post, so that I can look at making it available as a web-based utility.

How does it work?

If you are not a developer then this won’t mean much to you – feel free to bail now!

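Essentially, the program finds every note that matches specific search criteria (for example, in a particular notebook and with a particular title pattern) and that has a single resource with recognition data. For each of those notes it walks through the words Evernote recognised in the first couple of lines of text in the image and concatenates the best-match candidate for each word, as long as that candidate's weight is above a certain level. It stops building the title at the first word that has no candidate above that level, or when the title would exceed Evernote's maximum title length.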

Here is the program (in C#):

using System;
using System.Linq;
using System.Text;
using System.Xml.Linq;
using Evernote.EDAM.NoteStore;
using Evernote.EDAM.Type;
using Evernote.EDAM.UserStore;
using Thrift.Protocol;
using Thrift.Transport;

namespace AutoTitleEvernote {
  internal class Program {
    // Get this from https://www.evernote.com/api/DeveloperToken.action
    private const string AuthToken = "...";

    // Change this as appropriate
    private const string SearchString = @"Notebook:Inbox intitle:CC*";
    private const int MaxNotes = 1000; // Max nr of notes we will process
    private const int LineFudge = 10; // Used to determine if words are on the same line 
    private const int Lines = 2; // How many lines of text to read
    private const int MinWeight = 50; // How good a match do we need on words
    private const string HostUrl = "https://www.evernote.com/edam/user";

    private static void Main() {

      var noteStoreClient = GetNoteStoreClient(AuthToken);

      // Find notes with the required title in the required notebook
      var filter = new NoteFilter { Words = SearchString, 
                                    Ascending = false, 
                                    Order = (int)NoteSortOrder.CREATED };
      var notes = noteStoreClient.findNotes(AuthToken, filter, 0, MaxNotes);

      // For each note with a single resource and recognition data
      foreach (var note in notes.Notes.Where(n => n.Resources != null &&
                                                  n.Resources.Count == 1 &&
                                                  n.Resources.First().Recognition != null)) {
        // Download and parse the recognition XML
        var recoXmlBytes = noteStoreClient.getResourceRecognition(AuthToken,
                                                                  note.Resources.First().Guid);
        var recoXml = XElement.Parse(Encoding.UTF8.GetString(recoXmlBytes));

        var items = recoXml.Elements("item").ToList();
        var title = new StringBuilder();

        var lineY = -1;
        var line = 0;

        // For each word
        foreach (var item in items) {
          // Keep track of the current line
          var y = int.Parse(item.Attribute("y").Value);
          if (lineY == -1) {
            lineY = y;
          }
          else {
            if (y > lineY + LineFudge) {
              if (++line >= Lines) {
                break; // We've already read the requested number of lines
              }
              lineY = y;
            }
          }

          // Find the word's text and weight if the weight is above the criteria
          var word = (from t in item.Elements("t")
                      let weight = int.Parse(t.Attribute("w").Value)
                      orderby weight descending
                      where weight > MinWeight
                      select new { Weight = weight, Text = t.Value }).FirstOrDefault();

          if (word == null ||
              title.Length + word.Text.Length + 1 >=
                     Evernote.EDAM.Limits.Constants.EDAM_NOTE_TITLE_LEN_MAX) {
            break;
          }
          if (title.Length > 0) {
            title.Append(" ");
          }
          title.Append(word.Text);
          // title.Append("[" + word.Weight + "]");
        }
        if (title.Length == 0) {
          continue; // No usable text was recognised, so leave this note alone
        }

        Console.Out.Write("Rename " + note.Title + " to " + title + "? ");
        Console.Out.Flush();
        var input = Console.In.ReadLine();

        if (input == null || input.ToLower() != "y") {
          continue;
        }

        note.Title = title.ToString();
        noteStoreClient.updateNote(AuthToken, note);
      }

    }

    private static NoteStore.Client GetNoteStoreClient(string authToken) {
      var userStoreUrl = new Uri(HostUrl);
      var userStoreTransport = new THttpClient(userStoreUrl);
      var userStoreProtocol = new TBinaryProtocol(userStoreTransport);
      var userStore = new UserStore.Client(userStoreProtocol);

      var noteStoreUrl = userStore.getNoteStoreUrl(authToken);

      var noteStoreTransport = new THttpClient(new Uri(noteStoreUrl));
      var noteStoreProtocol = new TBinaryProtocol(noteStoreTransport);
      var noteStore = new NoteStore.Client(noteStoreProtocol);
      return noteStore;
    }
  }
}
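
If you want to experiment with the title-building rules without touching your Evernote account, the loop can be pulled out into a pure function that takes recognition XML as a string. The following is only a sketch of that idea: the class name TitleBuilder, the method BuildTitle and its parameter names are my own inventions rather than anything from the Evernote SDK, and it assumes the XML has the same item/t shape the program above reads. It reads at most maxLines lines of text.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper, not part of the Evernote SDK.
public static class TitleBuilder {
  // Builds a title from recognition XML: takes the best candidate for each
  // word on the first maxLines lines, stopping at the first word that has
  // no candidate with a weight above minWeight.
  public static string BuildTitle(string recoXml, int lineFudge, int maxLines,
                                  int minWeight, int maxTitleLength) {
    var title = new StringBuilder();
    var lineY = -1;
    var line = 0;

    foreach (var item in XElement.Parse(recoXml).Elements("item")) {
      // A jump in y larger than lineFudge is treated as a new line
      var y = (int)item.Attribute("y");
      if (lineY == -1) {
        lineY = y;
      } else if (y > lineY + lineFudge) {
        if (++line >= maxLines) {
          break; // Already read the requested number of lines
        }
        lineY = y;
      }

      // Highest-weighted candidate above the threshold, if there is one
      var best = item.Elements("t")
                     .Select(t => new { Weight = (int)t.Attribute("w"), Text = t.Value })
                     .Where(c => c.Weight > minWeight)
                     .OrderByDescending(c => c.Weight)
                     .FirstOrDefault();

      if (best == null || title.Length + best.Text.Length + 1 >= maxTitleLength) {
        break;
      }
      if (title.Length > 0) {
        title.Append(' ');
      }
      title.Append(best.Text);
    }
    return title.ToString();
  }
}
```

Calling TitleBuilder.BuildTitle(xml, 10, 2, 50, 255) mirrors the LineFudge, Lines and MinWeight settings used in the program; the 255 is just a stand-in for the title length limit, so pass Evernote.EDAM.Limits.Constants.EDAM_NOTE_TITLE_LEN_MAX if you have the SDK referenced.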

A word about Evernote’s text recognition

One of the many cool things about Evernote is that it automatically finds text in scanned images and PDFs so that when you subsequently search for that text, you can find the corresponding notes.

Evernote’s text recognition is incredibly powerful. It finds text that is at an angle, hand-written, and in various languages.

It is powerful, but it is not designed to provide a text version of a scanned document. Instead, it is designed to make searching for words work very well. For example, for each part of an image where it detects a word, it stores a list of candidate words that might match, each with a “weight” indicating how good a match it thinks it is.
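
To make that concrete, here is a small sketch that lists the candidates for each detected word, best first. The sample XML is made up for illustration (the words, weights and coordinates are invented), although it follows the item/t/w structure that the program above reads; the class RecoCandidates and its members are hypothetical names, not part of the Evernote SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for inspecting recognition data.
public static class RecoCandidates {
  // Invented sample: one detected word region with three candidate readings
  public const string SampleXml = @"<recoIndex>
  <item x=""40"" y=""32"" w=""180"" h=""44"">
    <t w=""52"">INVOKE</t>
    <t w=""87"">INVOICE</t>
    <t w=""31"">1NVOICE</t>
  </item>
</recoIndex>";

  // For each item (a region of the image), returns its candidate words
  // ordered from the highest weight to the lowest.
  public static List<List<(string Text, int Weight)>> Candidates(string recoXml) {
    return XElement.Parse(recoXml)
                   .Elements("item")
                   .Select(item => item.Elements("t")
                                       .Select(t => (Text: t.Value, Weight: (int)t.Attribute("w")))
                                       .OrderByDescending(c => c.Weight)
                                       .ToList())
                   .ToList();
  }
}
```

With the sample above, the single region comes back with INVOICE first, then INVOKE, then 1NVOICE. A search for either of the first two words could plausibly find the note, which is the point of keeping several candidates, whereas a generated title has to commit to just one of them.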

What I’m doing here is not really what Evernote’s text recognition was designed for … but it does work …