Firmware Over The Air (FOTA) for ESP8266 SoC

With the IoT booming, the number of connected devices is growing rapidly, and so is the software that drives them. Firmware Over The Air (FOTA) is a highly desirable, if not required, feature for any embedded project or product, whether DIY or commercial. A remote firmware update makes it possible to enhance product functionality and operational features and to ship fixes for particular issues. Updating the firmware OTA may eliminate the need to bring a product into a service center for repair. Not every issue can be resolved with a firmware update, but when one is available it can save a lot of time and money.
I already have a large number of connected devices running at home, and updating their firmware is a nightmare: I have to disconnect each device from whatever project setup it is attached to, re-program it and then re-install it. Not all embedded systems can support FOTA, because of the complexity and connectivity it requires; none of my RFM12B/RFM69CW related projects have that feature, for example. It is possible by installing additional SPI flash (see for example the Moteino wireless upgrade), but that isn’t a real self-updating node. It still requires quite a lot of skill to achieve, and it only works within the range of the transceivers (not over the Internet).

The ESP8266 SoC with its WiFi connectivity is well positioned to meet that need, and my WiFi relay project would greatly benefit from it. To achieve FOTA, the 512KB of SPI flash available on most ESP8266 breakouts is partitioned into two halves, and code runs from one or the other. Upon FOTA, the partition that is not currently in use gets flashed and the system restarts into it:

ESP8266 FOTA memory layout 512KB SPI flash size

This solution limits the available user code to 236KB (256KB-4KB-16KB). For ESP-HTTPD based projects that is just about the size of the unmodified basic application. If you need to squeeze in more functionality, you’d need an SPI flash larger than 512KB. There are few off-the-shelf ESP8266 breakout boards with more than 512KB of flash; alternatively, you could desolder the SPI flash and replace it with a larger one.
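
The arithmetic behind the 236KB figure can be sketched in a few lines of C. The function and constant names are illustrative, not SDK identifiers. The 4KB and 16KB values come from the formula above, and the sketch assumes the same two reserved areas apply to larger flash chips, which may not hold for every flash map.

```c
#include <stdint.h>

/* Reserved space per slot, taken from the 256KB-4KB-16KB formula in the text.
 * The constant names are illustrative, not SDK identifiers. */
#define RESERVED_SMALL_KB 4u
#define RESERVED_LARGE_KB 16u

/* User-code budget (in KB) per slot when the flash is split into two equal
 * halves. Returns 0 if a half is too small to hold the reserved areas. */
uint32_t fota_slot_budget_kb(uint32_t flash_size_kb)
{
    uint32_t half = flash_size_kb / 2u;
    uint32_t reserved = RESERVED_SMALL_KB + RESERVED_LARGE_KB;

    return (half > reserved) ? half - reserved : 0u;
}
```

Calling fota_slot_budget_kb(512) gives 236. Under the same assumption, a 1024KB chip would leave 492KB per slot.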

Firmware update modes
I can identify two separate firmware update modes: one where the firmware file is pushed to the device externally, and another that is initiated by the remote device itself, either periodically or on manual request.

Push firmware update

Pull firmware update

An externally pushed firmware update is when the device receives the firmware file over an HTTP POST request and flashes it. In the self-initiated mode the device periodically checks a central firmware repository for an update, pulls the firmware and reflashes itself. The latter can be initiated on a schedule or manually by the user (just like your computer checks for updates on its own, but you can also force a check).

The first method is more convenient during development, when the device is connected to the local network. I find it quite useful to be able to focus on developing the code without having to manually place the node in firmware update mode, connect the FTDI programmer, bring it out of firmware update mode and so on. The speed when flashing over a network connection is also quite good; the job gets done in roughly 10 seconds. @TVE has done great work on developing this approach with his esp-link project. Note the Makefile magic that makes all that possible.
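
In push mode the POST body typically arrives in chunks, so the receiving side has to erase and write the inactive partition as the data comes in. Below is a minimal sketch of that bookkeeping. It is plain C with a RAM buffer standing in for the SPI flash so that it can run on a PC. Every name in it (fota_writer_t, fota_write_chunk, the mock flash functions) is hypothetical and not taken from esp-link or the SDK. On the device the two mock functions would be replaced by the SDK's flash erase and write calls, which as far as I know also expect 4-byte aligned data, something this sketch ignores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE     4096u            /* SPI flash erase granularity */
#define MOCK_FLASH_SIZE (64u * 1024u)

/* RAM stand-in for the SPI flash (hypothetical, for PC testing only). */
static uint8_t mock_flash[MOCK_FLASH_SIZE];

static int mock_erase_sector(uint32_t sector)
{
    if ((sector + 1u) * SECTOR_SIZE > MOCK_FLASH_SIZE) return -1;
    memset(&mock_flash[sector * SECTOR_SIZE], 0xFF, SECTOR_SIZE);
    return 0;
}

static int mock_write(uint32_t addr, const uint8_t *data, size_t len)
{
    if (addr > MOCK_FLASH_SIZE || len > MOCK_FLASH_SIZE - addr) return -1;
    memcpy(&mock_flash[addr], data, len);
    return 0;
}

typedef struct {
    uint32_t base;      /* start of the inactive partition, sector aligned */
    uint32_t capacity;  /* maximum image size in bytes */
    uint32_t written;   /* bytes written so far */
} fota_writer_t;

/* Appends one chunk of the uploaded image. Returns 0 on success, -1 if the
 * image would not fit or a flash operation fails. */
int fota_write_chunk(fota_writer_t *w, const uint8_t *data, size_t len)
{
    if (len > w->capacity - w->written) return -1;   /* image too large */

    while (len > 0) {
        uint32_t addr = w->base + w->written;
        uint32_t offset = addr % SECTOR_SIZE;

        /* Erase each sector right before the first byte lands in it. */
        if (offset == 0 && mock_erase_sector(addr / SECTOR_SIZE) != 0) return -1;

        /* Never let a single write cross a sector boundary. */
        size_t room = SECTOR_SIZE - offset;
        size_t n = (len < room) ? len : room;

        if (mock_write(addr, data, n) != 0) return -1;
        w->written += (uint32_t)n;
        data += n;
        len -= n;
    }
    return 0;
}
```

The HTTP handler would call fota_write_chunk() for every piece of the body it receives and only mark the new image as bootable once the whole file has arrived.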

For devices in production it makes more sense to use the second mode, where the devices proactively check for new firmware. That also sidesteps the connectivity issues of the push method, since the device initiates the process from behind the firewall (no need for a fixed IP address or port forwarding for outside visibility).

That’s pretty easy to do with the SDK’s built-in functions; the following can get you going in this direction:

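// Header names as found in the Espressif non-OS SDK
#include "c_types.h"
#include "osapi.h"
#include "mem.h"
#include "ip_addr.h"
#include "espconn.h"
#include "user_interface.h"
#include "upgrade.h"

// Called by the SDK when the upgrade finishes, fails or times out
static void ICACHE_FLASH_ATTR ota_finished_callback(void *arg)
{
    struct upgrade_server_info *update = arg;
    if (update->upgrade_flag == true)
    {
        os_printf("[OTA]success; rebooting!\n");
        system_upgrade_reboot();
    }
    else
    {
        os_printf("[OTA]failed!\n");
    }

    // Release everything that handleUpgrade() allocated
    os_free(update->pespconn);
    os_free(update->url);
    os_free(update);
}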

// server_ip points to the 4 raw bytes of the IPv4 address, not to a string
static void ICACHE_FLASH_ATTR handleUpgrade(uint8_t serverVersion, const char *server_ip, uint16_t port, const char *path)
{
    const char *file;

    // We run from one partition, so request the image built for the other one
    uint8_t userBin = system_upgrade_userbin_check();
    switch (userBin)
    {
    case UPGRADE_FW_BIN1: file = "user2.bin"; break;
    case UPGRADE_FW_BIN2: file = "user1.bin"; break;
    default:
        os_printf("[OTA]Invalid userbin number!\n");
        return;
    }

    uint16_t version = 1; // version of the firmware currently running
    if (serverVersion <= version)
    {
        os_printf("[OTA]No update. Server version:%d, local version %d\n", serverVersion, version);
        return;
    }

    os_printf("[OTA]Upgrade available version: %d\n", serverVersion);

    struct upgrade_server_info *update = (struct upgrade_server_info *)os_zalloc(sizeof(struct upgrade_server_info));
    if (update == NULL)
    {
        os_printf("[OTA]Out of memory\n");
        return;
    }

    update->pespconn = (struct espconn *)os_zalloc(sizeof(struct espconn));
    update->url = (uint8 *)os_zalloc(512);
    if (update->pespconn == NULL || update->url == NULL)
    {
        os_printf("[OTA]Out of memory\n");
        if (update->pespconn != NULL) os_free(update->pespconn);
        if (update->url != NULL) os_free(update->url);
        os_free(update);
        return;
    }

    os_memcpy(update->ip, server_ip, 4);
    update->port = port;

    os_printf("[OTA]Server " IPSTR ":%d. Path: %s%s\n", IP2STR(update->ip), update->port, path, file);

    update->check_cb = ota_finished_callback;
    update->check_times = 10000; // upgrade timeout, in ms

    // The whole request must fit in the 512 byte buffer, so keep path short
    os_sprintf((char *)update->url,
        "GET %s%s HTTP/1.1\r\n"
        "Host: " IPSTR ":%d\r\n"
        "Connection: close\r\n"
        "\r\n",
        path, file, IP2STR(update->ip), update->port);

    if (system_upgrade_start(update) == false)
    {
        os_printf("[OTA]Could not start upgrade\n");

        os_free(update->pespconn);
        os_free(update->url);
        os_free(update);
    }
    else
    {
        os_printf("[OTA]Upgrading...\n");
    }
}
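
handleUpgrade() expects the server's version number as a parameter, so something has to fetch and parse it first. That part is not shown in this post. The sketch below is one possible way to do it, assuming (hypothetically) that the server publishes a small plain-text file that contains nothing but a decimal version number. The function names and the file format are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Parses a plain decimal version number (for example "2\n") from a response
 * body. Returns 0 and stores the value on success, -1 if the body is
 * malformed or the number does not fit the uint8_t used by handleUpgrade(). */
int parse_server_version(const char *body, size_t len, uint8_t *out)
{
    size_t i = 0;
    unsigned value = 0;

    if (len == 0 || body[0] < '0' || body[0] > '9') return -1;

    while (i < len && body[i] >= '0' && body[i] <= '9') {
        value = value * 10u + (unsigned)(body[i] - '0');
        if (value > 255u) return -1;
        i++;
    }

    /* Only trailing CR, LF or spaces may follow the digits. */
    while (i < len && (body[i] == '\r' || body[i] == '\n' || body[i] == ' ')) i++;
    if (i != len) return -1;

    *out = (uint8_t)value;
    return 0;
}

/* Same rule as in handleUpgrade(): update only if the server is strictly newer. */
int update_needed(uint8_t server_version, uint8_t local_version)
{
    return server_version > local_version;
}
```

Rejecting anything that is not a clean number keeps an error page or a truncated response from being mistaken for a new version.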

Here are the two methods in action:

Firmware update over the air

Security
In terms of security, well, it isn’t the most secure thing. The firmware is downloaded/uploaded over an unsecured HTTP channel, and not much validation is performed to ensure that the received file is what we think it is. I’m not sure what the FLASH/RAM overhead of upgrading over an SSL channel would be. These concerns can be mitigated by initiating updates manually (push or pull) and disabling the FOTA interfaces afterwards.

What could go wrong?
Whatever can go wrong eventually will, they say, so failures must be handled properly. An interrupted download/upload or a failed validation results in falling back to the previous firmware. The goal is obvious: to prevent a bricked device. While there is pretty good protection against these failures, there is no protection against uploading buggy new firmware that bricks the device. Testing, then testing some more, and then testing again before releasing new firmware should reduce the risk of that happening, and rolling it out in small batches also helps identify a problem before many devices are affected.
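
One way to implement such a validation step is to compare the length and a checksum of the received image against values published alongside it. The sketch below uses the standard CRC-32 (the variant used by zip and Ethernet), computed incrementally so that it can be fed chunk by chunk while downloading. This is an illustration only: I am not claiming this is the check the SDK performs, the function names are made up, and a CRC catches corruption and truncated transfers but not a deliberately forged image.

```c
#include <stddef.h>
#include <stdint.h>

/* Incremental CRC-32 (reflected polynomial 0xEDB88320).
 * Start with crc = 0 and feed the chunks in order. */
uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    crc ^= 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Accept the image only if both the length and the checksum match. */
int image_is_intact(uint32_t computed_crc, uint32_t expected_crc,
                    size_t received_len, size_t expected_len)
{
    return received_len == expected_len && computed_crc == expected_crc;
}
```

If image_is_intact() returns 0, the device simply keeps running the current partition and tries again later.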

Conclusion

I’m convinced the FOTA feature is a “must have” for any serious IoT project. I’m developing a FOTA capable (FOTA+ESPHTTPD+MQTT+SSL+FLASH CONFIG) platform that I will use for my future ESP8266 related projects.

 

 


12 thoughts on “Firmware Over The Air (FOTA) for ESP8266 SoC”

  1. Nice article, thank you!
    …I think you should change the 512Mb to 512kB all over the text?!

  2. Nice write-up Martin! So far I’ve had quite some success upgrading many times in development without soft-bricking the devices. One reason it has worked reasonably well is that I push the upgrade to the device via curl as opposed to having a UI flow to pull it. The latter would require more pieces to work and would have left me stranded a few times…

    I’m also not thrilled about the security situation. If you have a trusted encrypted network, like your own home network, it’s not all that bad. I’ll see what happens when I turn on SSL… Fortunately there are newer devices coming with 4x the flash space, which will alleviate this issue.

    • You get to call it when needed. My other part of the code (not published here) checks firmware version on server and if greater than currently running calls it.

      • Hi
        I am getting this as error continuously.
        Fatal exception (0):
        epc1=0x40241788, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000000, depc=0x00000000

        I have setup a wamp server.This is the code:
        LOCAL void ICACHE_FLASH_ATTR
        user_esp_platform_upgrade_begin(struct espconn *pespconn,
                                        struct upgrade_server_info *server) {
            uint8 user_bin[9] = { 0 };

            server->pespconn = pespconn;
            server->port = 80;
            server->check_cb = user_esp_platform_upgrade_rsp;
            server->check_times = 120000;
            const char esp_server_ip[4] = { 192, 168, 0, 191 };
            os_memcpy(server->ip, esp_server_ip, 4);

            if (server->url == NULL) {
                server->url = (uint8 *) os_zalloc(512);
            }
        #if 0 // in general, we need to check user bin here
            if (system_upgrade_userbin_check() == UPGRADE_FW_BIN1) {
                os_memcpy(user_bin, "user2.bin", 10);
            } else if (system_upgrade_userbin_check() == UPGRADE_FW_BIN2) {
                os_memcpy(user_bin, "user1.bin", 10);
            }

            os_sprintf(server->url, "GET /%s HTTP/1.0\r\nHost: "IPSTR":%d\r\n"pheadbuffer"",
                       user_bin, IP2STR(server->ip),
                       80);
        #else

            os_sprintf(server->url, "GET /%s HTTP/1.0\r\nHost: "IPSTR":%d\r\n"pheadbuffer"",
                       "download/file/user1.bin", IP2STR(server->ip), 80);
            INFO("%s\r\n", server->url);

        #endif

            if (system_upgrade_start(server) == false) {

                ESP_DBG("upgrade is already started\n");
            }
        }

        • I say try to pinpoint the exact line you get the exception at and focus the effort to fix there. I do that by placing printfs “Got here..” around the code and seeing if it ever gets reached.

  3. Martin,

    Do you know how much of the wifi API is in the user binary (flash)?

    I was wondering if we could just jump to RAM dynamically, with a small RAM firmware with wifi support (basic support). That way we could just erase whole of SPI flash and program it. Of course, any connection issue would cause system to never reboot properly.

    I still have some room, and may be able to manage splitting SPI in two chunks, I think:

    -rw-r--r-- 1 alvieboy alvieboy 40048 Aug 20 17:43 0x00000.bin
    -rw-r--r-- 1 alvieboy alvieboy 149840 Aug 20 17:43 0x40000.bin

    Alternative is to hook an additional SPI flash to my design (which should be simple), and use it for storing the temporary fw (and then use RAM-only code to flash the main flash). Would also be nice cause I can use it for other needed storage, and those chips are really cheap (20 cents or so for 8mbit).

    My design boasts an ESP8266 (ESP07) module and a small CPLD (Xilinx) in order to drive RGB panels through the SPI port. It’s working well, serial programming (and console) both work, but I really could use OTA upgrades. I’m planning to have the design on sale by the end of this year, but OTA needs to be in place by then.

    Alvie

    • Hi,
      FOTA isn’t entirely reliable; in fact I am seeing 70-80% success and have to re-run the failed attempts. Network glitches/hiccups cause it to time out, so RAM based updates are in my opinion out of the question. I don’t see a problem splitting relatively large flash chips in two; that still leaves you with tons of space to use.
      I use 8mbit chips and run firmware close to 400K in size without issues

  4. Hi Martin,

    I have a problem with FOTA. I use an ESP07 1Mb with SDK 1.2.0 (I also tried 1.1.2 and 1.3.0 and got the same result). I use your code with a minor modification to the file URL (I checked the Apache log and I’m sure the board uses the right URL: "GET /upgrade/user1.1024.new.2.bin HTTP/1.1" 200 304240 "-" "-").

    Here is the ESP8266 output:
    SDK ver: 1.3.0 compiled @ Aug 7 2015 19:17:30
    phy ver: 41201, pp ver: 9.0

    SDK version:1.3.0 rom 1
    mode : sta(18:fe:34:99:31:77)
    add if0
    f 0, scandone
    state: 0 -> 2 (b0)
    state: 2 -> 3 (0)
    state: 3 -> 5 (10)
    add 0
    aid 1
    pm open phy_2,type:2 0 0
    cnt

    connected with Vinh 1102, channel 7
    Wifi event: 0
    Free heap: 49976
    dhcp client start…
    Wifi event: 3
    [OTA]Upgrade available version: 2
    [OTA]Server 192.168.1.120:80. Path: /upgrade/user1.1024.new.2.bin
    URL: GET /upgrade/user1.1024.new.2.bin HTTP/1.1
    Host: 192.168.1.120:80
    Connection: close

    .
    system_upgrade_start
    upgrade_connect
    [OTA]Upgrading…
    Free heap: 48496
    ip:192.168.1.58,mask:255.255.255.0,gw:192.168.1.1
    upgrade_connect_cb
    GET /upgrade/user1.1024.new.2.bin HTTP/1.1
    Host: 192.168.1.120:80
    Connection: close

    sumlength = 304240
    �@*r��P���K�t�dr�.@��.�Ѫ�½�a /t�$v’

    Do you have any idea about this problem???
