6
\$\begingroup\$

Part number: 24FC256

I am using an EEPROM to store some configurations for the MCU logic that must be retained if a power down condition occurs. Also, I store events into the EEPROM so that I don't lose them.

I would like to extend the EEPROM's endurance as much as I can, and to achieve that I have thought of some techniques.

  1. Store binary value

I have some configuration variables that are boolean values. Writing 0x00 and 0x01 to a byte over and over again doesn't look very efficient. To get better wear leveling, I have designed the following strategy:

Starting from the value 0xFF (0b11111111), I consider the stored value to be '0' if the first '1' of the byte (scanning from the most significant bit) is in an odd bit position. In this case the first '1' is in bit position 7, so the value 0xFF encodes a '0'.

Now I want to change the value from '0' to '1', so I write 0x7F (0b01111111) to the EEPROM. The first '1' is now in an even position (bit 6), which means that the byte stores a '1'.

Each further change clears the next bit. Once the value is 0b00000001, the next value written is 0b11111111 and the process starts again.

I think that the idea is pretty clear.

My question is: does this strategy extend the life span of that EEPROM byte by a factor of 8? As I understand it, what wears a cell out is erasing it (converting 0 to 1), whereas writing (converting 1 to 0) does not. Since I perform 8 write cycles before erasing again, I reduce the erase cycles to one eighth.

I want to believe that the EEPROM is smart enough not to perform an erase before the write if there is no 0 -> 1 transition from the old value to the new one.
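
The encoding described above can be modelled on a PC to check the bookkeeping. This is only a sketch of the scheme as stated in the question: the function names are made up for illustration, an erased byte is assumed to read 0xFF, and "needs an erase" simply means that some bit would have to go from 0 back to 1. Whether the real device skips the erase in the other cases is exactly what is being asked, so the model says nothing about that.

```python
ERASED = 0xFF  # assumption: an erased byte reads back as 0xFF


def decode(byte):
    """Decode the stored boolean from the position of the highest '1' bit."""
    if not 0 < byte <= 0xFF:
        raise ValueError("byte must be in 0x01..0xFF")
    highest = byte.bit_length() - 1        # 7 for 0xFF, 6 for 0x7F, ...
    return 0 if highest % 2 == 1 else 1    # odd position -> '0', even -> '1'


def toggle(byte):
    """Return the next byte to write in order to flip the stored boolean."""
    nxt = byte >> 1                        # drops the highest '1' of 0xFF, 0x7F, 0x3F, ...
    return nxt if nxt else ERASED          # after 0x01, wrap around to 0xFF


def needs_erase(old, new):
    """True if any bit would have to go from 0 back to 1."""
    return (new & ~old & 0xFF) != 0


def simulate(toggles):
    """Flip the boolean `toggles` times from 0xFF; return (decoded values, erase count)."""
    value, erases, decoded = ERASED, 0, []
    for _ in range(toggles):
        nxt = toggle(value)
        erases += needs_erase(value, nxt)
        value = nxt
        decoded.append(decode(value))
    return decoded, erases
```

In this model, `simulate(8)` walks through 0x7F, 0x3F, ..., 0x01 and back to 0xFF, and only the final step contains a 0 -> 1 transition.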

  2. Store data arrays

Another task is to store events in the EEPROM. Say I want to store the Unix timestamp at which an event happened, using a circular buffer of 32 elements of 4 bytes each.

Once an event has been sent, I would like to "remove" it from the buffer. To do that, I have thought of overwriting the 4-byte element with 0x00000000. Another approach would be to erase it to 0xFFFFFFFF. When a new event happens, it is written into the previously removed element.

Would this strategy involve two erase/write cycles (the first to remove the value and the second to write the new one)? Or, in the 0x00000000 case, is the removal only a write (which doesn't wear the cell out), so that this operation wouldn't count? Or, in the 0xFFFFFFFF case, does the removal perform the erase, so that the next write only has to program bits?

All these questions arise from my doubt about what exactly causes cell wear: whether it is just the erase step or not.
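
To make the two "remove" options concrete, here is a small bit-level model. It assumes, purely for illustration, that only 0 -> 1 transitions would require an erase; the helper names and the two example timestamps are invented, and the model does not describe what the 24FC256 does internally.

```python
def transitions(old, new, width=32):
    """Count bits going 1->0 (program) and 0->1 (erase) between two words."""
    mask = (1 << width) - 1
    programmed = bin(old & ~new & mask).count("1")
    erased = bin(new & ~old & mask).count("1")
    return programmed, erased


def slot_reuse(removed_marker, old_ts, new_ts):
    """Return (remove needs 0->1 flips, store needs 0->1 flips) for one slot reuse."""
    _, erase_on_remove = transitions(old_ts, removed_marker)
    _, erase_on_store = transitions(removed_marker, new_ts)
    return erase_on_remove > 0, erase_on_store > 0


# Arbitrary example timestamps, one minute apart
OLD_TS, NEW_TS = 0x660D0B00, 0x660D0B3C

print(slot_reuse(0x00000000, OLD_TS, NEW_TS))  # removal marker is all zeros
print(slot_reuse(0xFFFFFFFF, OLD_TS, NEW_TS))  # removal marker is all ones
```

Under that assumption, each option moves the 0 -> 1 step to a different write (the store for 0x00000000, the removal for 0xFFFFFFFF), but neither avoids it: a slot reuse contains one such step either way.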

\$\endgroup\$
12
  • 1
    \$\begingroup\$ It's an EEPROM but you never say how it works. Is it internal to some MCU, or an external chip? Which MCU or chip it is? Generally speaking, EEPROM is always erased before writing it, and it's FLASH that can be programmed without erasing in between. Different MCUs have different internal EEPROMs anyway, they could work in erase groups of 4 bytes even if you can write one byte at a time. Unless you mean FLASH. \$\endgroup\$ Commented Apr 3 at 8:03
  • \$\begingroup\$ Do you really need to reinvent the wheel? Some people have done it github.com/search?q=eeprom%20wear&type=repositories \$\endgroup\$ Commented Apr 3 at 8:10
  • 2
    \$\begingroup\$ Have you considered Ferroelectric RAM? "...maximum read/write endurance (about 10^10 to 10^15 cycles)." \$\endgroup\$ Commented Apr 3 at 8:12
  • 1
    \$\begingroup\$ A large portion of the question is senseless without knowing if 0xFF or 0x00 corresponds to an erased cell - you never say. \$\endgroup\$ Commented Apr 3 at 8:25
  • \$\begingroup\$ @Justme Sorry, I wrote the part numbers but after editing the comment it seems that I have deleted it. It is external, the part number is 24FC256. \$\endgroup\$ Commented Apr 3 at 10:00

2 Answers

6
\$\begingroup\$

What causes wear isn't so much the writes as the erases. Though they go hand in hand - in order to write you first need to erase, since you can only write by flipping a bit from the erased default value to the other value, never the other way around.

Therefore the typical wear leveling algorithm works on chunks of the smallest possible erase/write size. Let's say, for example, that you have 512 bytes of EEPROM (in this millennium, it is probably data flash emulated EEPROM) and the smallest erase/write size is 8 bytes. You will then have to arrange your data in 8-byte segments, each including its own checksum. There are 512/8=64 segments available. For wear leveling it is impossible to have one single checksum at the end of the memory; you'll have to put it together with each segment - and yes, that's quite inefficient.

  • The program starts with the EEPROM erased and then written with default values, either as part of the linker-generated machine code downloaded into on-chip memory, or by a bootloader or other such driver. Even though the erase size may be smaller, the whole memory needs to be erased in order to keep track of which parts contain valid data.
  • The used data now sits at the least significant address and is marked as programmed by having one or more bits in the non-erased state. I.e. if the erased state is all ones then the data must be guaranteed to contain at least one zero, ideally at a reserved bit position.
  • The used data occupies n segments out of the 64 available. 64 ought to be a multiple of n or otherwise the algorithm turns needlessly complex.
  • Whenever data needs to be changed, a new chunk of n segments is stored in RAM and modified.
  • The new n segments of data are written to the EEPROM just past where the previous n segments were stored. Since the EEPROM is already erased, writing is fine. The old data still sits at the least significant address but is now outdated and irrelevant. The new relevant data is stored at least significant address + n segments.
  • When the program uses the data, it does a linear search from the least significant address upwards, until it encounters the start of an n-segment section where the expected "programmed" bit is not zero, i.e. an erased part. The n segments before that one are the valid data to be used.
  • If no erased chunk of n segments was found, the whole memory is full and an erase needs to be performed.
  • During erase, the valid data needs to be kept in RAM. Take care that a power-down during this time does not lose it, often by picking a bulk cap on the MCU supply large enough to keep it up for at least the specified erase + write time.
  • After erase, the data is stored in the least significant address and it all starts over.
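
The steps above can be sketched as a small simulation. This is a minimal model under stated assumptions rather than a driver: the memory is a `bytearray`, one record occupies a single 8-byte segment (n = 1), bit 7 of the first byte is the reserved "programmed" marker, and the per-segment checksum is left out for brevity. The class and method names are hypothetical.

```python
SEG = 8        # assumed smallest erase/write unit, in bytes
N_SEG = 64     # 512-byte memory / 8-byte segments
ERASED = 0xFF  # assumed erased state: all ones


class WearLeveledStore:
    """Append-only record store over a simulated EEPROM (one segment per record)."""

    def __init__(self):
        self.mem = bytearray([ERASED] * (SEG * N_SEG))
        self.erase_count = 0

    def _is_used(self, seg):
        # Reserved marker: bit 7 of the first byte is 0 once a segment is programmed.
        return (self.mem[seg * SEG] & 0x80) == 0

    def _find_latest(self):
        """Linear search from the lowest address; index of the newest segment or None."""
        latest = None
        for seg in range(N_SEG):
            if not self._is_used(seg):
                break
            latest = seg
        return latest

    def read(self):
        """Return the 7 payload bytes of the newest record, or None if empty."""
        seg = self._find_latest()
        if seg is None:
            return None
        start = seg * SEG
        return bytes(self.mem[start + 1:start + SEG])

    def write(self, payload):
        """Store a new 7-byte record just past the previous one."""
        if len(payload) != SEG - 1:
            raise ValueError("payload must be exactly 7 bytes")
        latest = self._find_latest()
        seg = 0 if latest is None else latest + 1
        if seg == N_SEG:  # memory full: erase everything and start over
            self.mem[:] = bytes([ERASED] * len(self.mem))
            self.erase_count += 1
            seg = 0
        start = seg * SEG
        self.mem[start] = 0x7F  # marker byte with bit 7 cleared
        self.mem[start + 1:start + SEG] = payload
```

In this model the new record is held in RAM (the `payload` argument) while the full erase runs, which is the window in which a real design has to survive a power loss.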

Observations:

  • Wear leveling occupies a lot of memory so it is really wasteful and not really feasible unless you only need to store small amounts of data. Typical applications are logging or some kind of meter that remembers the last value/setting.

  • Access times will not be random access, but rather grow the more you fill up the memory. On the other hand, as soon as you've found the valid data, you can store a pointer to that address which only needs to be changed the next time you write something to the EEPROM.

  • You need to calculate how long the memory will last by checking the maximum number of erase cycles specified by the manufacturer, how often you write and how many segments you've divided the memory into. If you have divided the memory into 64 segments and one segment is enough to contain all data, then naturally the EEPROM will last 64 times longer than it would without wear leveling.

    When calculating lifetime, keep the MCU clock inaccuracy in mind, since it changes how often a time-triggered write actually happens.

  • Implementing this whole thing is tiresome and time-consuming. I've done it many times and it's a lot of busy-work and testing which could be better spent on something else. So maybe consider technologies like FRAM/MRAM to save you from this, especially if dev time/time to market is more important than BoM costs.
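
The lifetime estimate from the observations can be written down as a short calculation. This is a rough sketch: the function and parameter names are made up, the endurance figure in the example is a placeholder rather than a datasheet value, and `clock_error` is a simple way to derate for a clock that runs fast.

```python
def lifetime_years(endurance_cycles, total_segments, segments_per_record,
                   writes_per_day, clock_error=0.0):
    """Estimate years until the erase budget is used up.

    clock_error is a fraction (0.02 = clock may run 2 % fast), which makes
    time-triggered writes happen more often than nominally planned.
    """
    slots = total_segments // segments_per_record      # records per full erase
    total_writes = endurance_cycles * slots            # writes before wear-out
    effective_rate = writes_per_day * (1.0 + clock_error)
    return total_writes / effective_rate / 365.0


# Placeholder numbers: 100k erase cycles, one write per minute
without_leveling = lifetime_years(100_000, 1, 1, 24 * 60)
with_leveling = lifetime_years(100_000, 64, 1, 24 * 60)
print(without_leveling, with_leveling, with_leveling / without_leveling)
```

With 64 segments and one segment per record, the ratio of the two results is the factor of 64 mentioned above.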

\$\endgroup\$
3
  • \$\begingroup\$ Just use a FRAM and get it right. It uses less power at the critical power down time, has way more erase cycles and runs faster with no polling to see if it's done. \$\endgroup\$ Commented Apr 3 at 10:53
  • 3
    \$\begingroup\$ There should also be a reserved bit for "this block has been erased", and ideally the status bits are duplicated, so you would erase a block, then on success set the first byte to 0x7f, then to 0x3f. If you read the first byte as 0xff, then it is likely that the erase was interrupted and needs to be repeated. If the first byte reads 0x7f, then marking the block as erased was interrupted, and the byte needs to be written as 0x7f, and subsequently as 0x3f again. Any other value larger than 0x3f indicates an error. \$\endgroup\$ Commented Apr 3 at 10:59
  • \$\begingroup\$ @SimonRichter: Information about a block needing to be erased and having been successfully erased should be stored outside that block, in such a way that if power is lost while a block is being erased, it will be possible to determine which block was being erased without having to care about what it seems to hold, though if a part lacks any means of performing separate write and erase operations ensuring robust operation may not be 100% possible. \$\endgroup\$ Commented Apr 3 at 22:24
6
\$\begingroup\$

You are using the 24FC256.

It has a 64-byte page buffer and is rated for more than 1 million erase/write cycles, per page.

The data sheet says that even if you want to update just a single byte, the entire page is erased and rewritten.

Before going further, work out a ballpark number to see whether you need to think about wear leveling at all.

If your config settings all fit into a 64-byte page, and you change and store settings once per hour, around the clock, every day of the year, it takes 114 years to reach 1 million writes. Granted, I did not account for leap years, but it is safe to say that is irrelevant.

The event logging might be best done in a round-robin/ring-buffer fashion, but again, if events are shorter than 64 bytes, then for example two events of 32 bytes written contiguously take two erase/write cycles. But there are 512 pages of 64 bytes, so you can write 512 million events of 64 bytes before the EEPROM wears out. Or 256 million events of 32 bytes. If an event of 64 bytes is written once per second, all 512 pages have suffered 1 million erase/write cycles after about 16 years.
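
The ballpark figures can be reproduced with a few lines of arithmetic. The inputs are the ones quoted in this answer (256 Kbit device, 64-byte pages, 1 million cycles per page); the 365-day year and the variable names are assumptions of this sketch.

```python
CAPACITY_BYTES = 256 * 1024 // 8      # 256 Kbit device = 32768 bytes
PAGE_BYTES = 64
ENDURANCE = 1_000_000                 # rated erase/write cycles per page
HOURS_PER_YEAR = 365 * 24
SECONDS_PER_YEAR = HOURS_PER_YEAR * 3600

pages = CAPACITY_BYTES // PAGE_BYTES  # number of 64-byte pages

# One config page rewritten once per hour
config_years = ENDURANCE / HOURS_PER_YEAR

# Ring buffer over all pages, one full-page write per second
total_page_writes = pages * ENDURANCE
ring_years = total_page_writes / SECONDS_PER_YEAR

print(f"pages: {pages}")
print(f"config page, hourly writes: {config_years:.1f} years")
print(f"ring buffer, one page per second: {ring_years:.1f} years")
```

With these inputs the hourly config case comes out at roughly 114 years, and the once-per-second ring buffer at roughly 16 years.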

So the best way to do wear leveling is to write only full 64-byte pages when there is a need to write anything, and keep track of newest and oldest page so oldest can be overwritten.
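
One way to keep track of the newest and oldest page is to give every page a sequence number. The sketch below is a simulation under assumptions that are not from the datasheet: a 4-byte big-endian sequence number at the start of each page, an all-0xFF header meaning "blank", and no handling of sequence-number wrap-around. The class name and layout are hypothetical.

```python
import struct

PAGE_BYTES = 64
N_PAGES = 512
HEADER = struct.Struct(">I")   # assumed 4-byte big-endian sequence number
BLANK_SEQ = 0xFFFFFFFF         # header of a page that was never written


class PageRing:
    """Ring of full pages; the page after the newest one is the oldest."""

    def __init__(self):
        self.pages = [bytes([0xFF]) * PAGE_BYTES for _ in range(N_PAGES)]
        self.writes = [0] * N_PAGES   # per-page write counter, for inspection

    def newest(self):
        """Scan all pages; return (index, seq) of the newest one, or None if blank."""
        best = None
        for idx in range(N_PAGES):
            seq = HEADER.unpack_from(self.pages[idx])[0]
            if seq != BLANK_SEQ and (best is None or seq > best[1]):
                best = (idx, seq)
        return best

    def append(self, payload):
        """Write one full page over the oldest slot and return its index."""
        room = PAGE_BYTES - HEADER.size
        if len(payload) > room:
            raise ValueError("payload does not fit in one page")
        newest = self.newest()
        if newest is None:
            idx, seq = 0, 0
        else:
            idx, seq = (newest[0] + 1) % N_PAGES, newest[1] + 1
        # Always write the whole page: header plus payload padded with 0xFF.
        self.pages[idx] = HEADER.pack(seq) + payload.ljust(room, b"\xff")
        self.writes[idx] += 1
        return idx
```

A real driver would scan once at start-up and cache the newest index in RAM instead of rescanning on every append; the scan is repeated here only to keep the sketch short.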

\$\endgroup\$
4
  • \$\begingroup\$ The events are 8 bytes long, which, as I now see, means that each page would get only 1/8th of the 1 million cycles. Another approach is to use a whole 64-byte page for 8 bytes of data, which gives 1 million cycles but reduces the available size to 1/8th. (Or any intermediate number of events per page.) \$\endgroup\$ Commented Apr 3 at 10:50
  • \$\begingroup\$ @Gorka3 Or use multiple EEPROM chips with smaller page size. Or Flash instead of EEPROM. Or battery backed RAM to buffer events, such may be built-in to your MCU already. \$\endgroup\$ Commented Apr 3 at 10:55
  • \$\begingroup\$ The battery backed idea would be nice but I am bound to the actual HW so it is not possible, MCU's internal FLASH is also almost full so it is neither an option. The best solution I find is to write back the events only when a whole page would be written, I mean, wait until I have the whole page to write to actually write it. This has the handicap that if a power down happens I lost the last four events, but it is what it is. \$\endgroup\$ Commented Apr 3 at 11:02
  • 1
    \$\begingroup\$ I designed a tool where the speed setting had to always be remembered. Then it's rather one write per second than once per hour. 114 years become 2 years. It all depends on the requirements of the specific application. \$\endgroup\$ Commented Apr 3 at 13:04
