path: root/arch
Commit message [Author; Age; Files; Lines]
* ARM: boards: kindle mx50: adapt MCI name to DTS [Alexander Kurz; 2019-05-21; 2 files; -2/+2]
|   The MCI instances got aliases in the DTS from linux upstream which changed
|   the eMMC devicename e.g. from disk0 to mmc2. Adapt to this.
|
|   Signed-off-by: Alexander Kurz <akurz@blala.de>
|   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* ARM: include: dma: Zero out DMA coherent memory in no-MMU case [Andrey Smirnov; 2019-05-21; 1 file; -0/+2]
|   Add code to match the behavior of dma_alloc_coherent() when CONFIG_MMU is
|   selected.
|
|   Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
|   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
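The behavior described above can be sketched in hosted C. This is only an illustration, not the barebox implementation: `demo_dma_alloc_coherent()` is a hypothetical name, `aligned_alloc()` stands in for barebox's allocator, the 64-byte alignment is an arbitrary choice, and the sketch assumes that without an MMU the bus address equals the CPU address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_DMA_ALIGN 64u /* assumed alignment, chosen for illustration */

/* Hypothetical stand-in for a no-MMU dma_alloc_coherent(); not barebox code. */
static void *demo_dma_alloc_coherent(size_t size, uintptr_t *dma_handle)
{
	assert(size > 0);

	/* aligned_alloc() wants a size that is a multiple of the alignment. */
	size_t rounded = (size + DEMO_DMA_ALIGN - 1) & ~(size_t)(DEMO_DMA_ALIGN - 1);
	void *buf = aligned_alloc(DEMO_DMA_ALIGN, rounded);

	if (!buf)
		return NULL;

	/* The step the commit is about: hand out zeroed memory. */
	memset(buf, 0, rounded);

	/* Assumption: without an MMU the bus address equals the CPU address. */
	if (dma_handle)
		*dma_handle = (uintptr_t)buf;

	return buf;
}
```

A caller would release the buffer with `free()` in this hosted sketch; the real code pairs the allocation with its own free routine.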
* Merge branch 'for-next/ppc' [Sascha Hauer; 2019-05-10; 14 files; -17/+20]
|\
| * owc: remove references to GE. [Barbier, Renaud; 2019-05-08; 5 files; -16/+19]
| |   As per contractual requirement, remove references to GE in the code.
| |
| |   Signed-off-by: Renaud Barbier <renaud.barbier@abaco.com>
| |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * owc: directories and files renaming [Barbier, Renaud; 2019-05-08; 14 files; -1/+1]
| |   As the company changed name to Abaco Systems Inc, we have a contractual
| |   requirement to remove GE references.
| |   Start by renaming files and directories using a neutral name.
| |
| |   Signed-off-by: Renaud Barbier <renaud.barbier@abaco.com>
| |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* | Merge branch 'for-next/mvebu' [Sascha Hauer; 2019-05-10; 9 files; -12/+97]
|\ \
| * | ARM: mvebu: defconfig: update [Sascha Hauer; 2019-05-07; 1 file; -11/+10]
| | |   Enable remaining boards and enable some more features.
| | |
| | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | arm: add basic support for the Armada XP DB platform [Sascha Hauer; 2019-05-07; 8 files; -1/+87]
| | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* | | Merge branch 'for-next/mmc' [Sascha Hauer; 2019-05-10; 3 files; -6/+9]
|\ \ \
| * | | ARM: rpi3: remove swapped sdhci and sdhost [Lucas Stach; 2019-05-06; 1 file; -5/+0]
| | | |   Now that we have a sdhost driver there is no need to swap those
| | | |   peripherals. As the sdhci peripheral is only used for SDIO, which isn't
| | | |   supported in Barebox, disable it.
| | | |
| | | |   Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: rpi: enable sdhost driver in defconfig [Lucas Stach; 2019-05-06; 1 file; -1/+1]
| | | |   Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: rpi: add clock for sdhost [Lucas Stach; 2019-05-06; 1 file; -0/+8]
| |/ /
| | |   The sdhost is driven by the core clock.
| | |
| | |   Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
| | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* | | Merge branch 'for-next/misc' [Sascha Hauer; 2019-05-10; 1 file; -17/+0]
|\ \ \
| * | | ARM: socfpga: Cyclone5: remove watchdog_disable() [Ian Abbott; 2019-04-23; 1 file; -17/+0]
| | |/
| |/|
| | |   watchdog_disable() is left over from the original SoCFPGA commit and
| | |   nothing calls it. Remove it to avoid a '-Wmissing-prototypes' warning
| | |   from the compiler.
| | |
| | |   Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
| | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* | | Merge branch 'for-next/mips' [Sascha Hauer; 2019-05-10; 26 files; -36/+351]
|\ \ \
| * | | MIPS: remove request_sdram_region "fdt" [Oleksij Rempel; 2019-04-23; 1 file; -4/+0]
| | | |   It is actually not needed at barebox runtime.
| | | |
| | | |   Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | MIPS: relocation: do not use configurable memory layout [Oleksij Rempel; 2019-04-23; 6 files; -12/+41]
| | | |   The relocator is not able to properly patch the new location of the
| | | |   stack. To make it work properly it is better to disable
| | | |   HAVE_CONFIGURABLE_MEMORY_LAYOUT.
| | | |
| | | |   Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | MIPS: relocation: add relocation support [Oleksij Rempel; 2019-04-23; 10 files; -7/+285]
| | | |   This patch is a port of the following patch from U-Boot, with some
| | | |   additional integration changes and fixes to the original code:
| | | |
| | | |   | Subject: [PATCH] MIPS: Stop building position independent code
| | | |   |
| | | |   | U-Boot has up until now built with -fpic for the MIPS architecture,
| | | |   | producing position independent code which uses indirection through a
| | | |   | global offset table, making relocation fairly straightforward as it
| | | |   | simply involves patching up GOT entries.
| | | |   |
| | | |   | Using -fpic does however have some downsides. The biggest of these is
| | | |   | that generated code is bloated in various ways. For example, function
| | | |   | calls are indirected through the GOT & the t9 register:
| | | |   |
| | | |   |     8f998064 lw t9,-32668(gp)
| | | |   |     0320f809 jalr t9
| | | |   |
| | | |   | Without -fpic the call is simply:
| | | |   |
| | | |   |     0f803f01 jal be00fc04 <puts>
| | | |   |
| | | |   | This is more compact & faster (due to the lack of the load & the
| | | |   | dependency the jump has on its result). It is also easier to read &
| | | |   | debug because the disassembly shows what function is being called,
| | | |   | rather than just an offset from gp which would then have to be looked up
| | | |   | in the ELF to discover the target function.
| | | |   |
| | | |   | Another disadvantage of -fpic is that each function begins with a
| | | |   | sequence to calculate the value of the gp register, for example:
| | | |   |
| | | |   |     3c1c0004 lui gp,0x4
| | | |   |     279c3384 addiu gp,gp,13188
| | | |   |     0399e021 addu gp,gp,t9
| | | |   |
| | | |   | Without using -fpic this sequence no longer appears at the start of each
| | | |   | function, reducing code size considerably.
| | | |   |
| | | |   | This patch switches U-Boot from building with -fpic to building with
| | | |   | -fno-pic, in order to gain the benefits described above. The cost of
| | | |   | this is an extra step during the build process to extract relocation
| | | |   | data from the ELF & write it into a new .rel section in a compact
| | | |   | format, plus the added complexity of dealing with multiple types of
| | | |   | relocation rather than the single type that applied to the GOT. The
| | | |   | benefit is smaller, cleaner, more debuggable code. The relocate_code()
| | | |   | function is reimplemented in C to handle the new relocation scheme,
| | | |   | which also makes it easier to read & debug.
| | | |   |
| | | |   | Taking maltael_defconfig as an example the size of u-boot.bin built
| | | |   | using the Codescape MIPS 2016.05-06 toolchain (gcc 4.9.2, binutils
| | | |   | 2.24.90) shrinks from 254KiB to 224KiB.
| | | |   |
| | | |   | Signed-off-by: Paul Burton <paul.burton@imgtec.com>
| | | |
| | | |   Signed-off-by: Oleksij Rempel <linux@rempel-privat.de>
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
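The `jal` encoding quoted in the commit message can be checked by hand, and doing so shows why non-PIC code needs fix-ups once the image moves. The sketch below is not the `relocate_code()` from the patch. The function names are hypothetical, the call site is assumed to sit in the 0xb0000000 segment (the commit does not say where it is), and the fix-up only covers the simple case of a 4-byte-aligned move that keeps the target in the same 256 MiB region.

```c
#include <assert.h>
#include <stdint.h>

#define MIPS_TARGET_MASK 0x03ffffffu /* 26-bit word index in J/JAL */

/* Absolute target of a J/JAL instruction located at address pc. */
static uint32_t demo_jal_target(uint32_t insn, uint32_t pc)
{
	/* The top 4 bits come from the delay slot address (pc + 4). */
	return ((pc + 4) & 0xf0000000u) | ((insn & MIPS_TARGET_MASK) << 2);
}

/*
 * Patch the 26-bit field after the image moved by `offset` bytes.
 * Illustrative R_MIPS_26-style fix-up; assumes the new target stays
 * inside the same 256 MiB region as the old one.
 */
static uint32_t demo_jal_relocate(uint32_t insn, uint32_t offset)
{
	assert((offset & 3u) == 0);

	uint32_t field = insn & MIPS_TARGET_MASK;

	field = (((field << 2) + offset) >> 2) & MIPS_TARGET_MASK;
	return (insn & ~MIPS_TARGET_MASK) | field;
}
```

With the `-fpic` variant only GOT entries need patching; with `-fno-pic` every such instruction has to be rewritten, which is the "multiple types of relocation" cost the quoted message mentions.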
| * | | MIPS: relocation: pass ram size to pbl_main_entry [Oleksij Rempel; 2019-04-23; 12 files; -13/+25]
| |/ /
| | |   To make barebox dynamically relocatable it should know the RAM size to be
| | |   able to calculate the proper new location.
| | |
| | |   Signed-off-by: Oleksij Rempel <linux@rempel-privat.de>
| | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
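Why the RAM size matters can be illustrated with a small calculation. The commit only states that the size is needed to compute the new location. Placing the image at the top of RAM, the 64 KiB alignment, and every name in the sketch are assumptions made for illustration, not details taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RELOC_ALIGN 0x10000u /* assumed 64 KiB alignment */

/*
 * Hypothetical helper: pick a relocation address just below the end of RAM,
 * rounded down to DEMO_RELOC_ALIGN. Not the barebox implementation.
 */
static uintptr_t demo_relocation_target(uintptr_t ram_base, size_t ram_size,
					size_t image_size)
{
	assert(image_size <= ram_size);

	uintptr_t top = ram_base + ram_size;

	return (top - image_size) & ~(uintptr_t)(DEMO_RELOC_ALIGN - 1);
}
```

Without `ram_size` the caller could only guess `top`, which is why the size has to be handed to the entry function.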
* | | Merge branch 'for-next/layerscape' [Sascha Hauer; 2019-05-10; 18 files; -170/+365]
|\ \ \
| * | | ARM: Layerscape: Add device tree compatible to image metadata [Sascha Hauer; 2019-05-10; 2 files; -0/+6]
| | | |   Enrich the image metadata with the device tree compatible string the
| | | |   image supports.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: TQMLS1046a: Add environment and update handlers [Sascha Hauer; 2019-05-10; 2 files; -0/+80]
| | | |   The TQMLS1046a can boot from QSPI and SD/MMC. Add partitioning for these
| | | |   devices and barebox environment / barebox update handlers on them.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: TQMLS1046a: Support booting from QSPI [Sascha Hauer; 2019-05-10; 1 file; -1/+1]
| | | |   We have to build correct images suitable for QSPI, thus have to call
| | | |   lspbl_spi_image instead of lspbl_image. In lowlevel code call the
| | | |   xload function which detects the bootsource rather than hardcoding
| | | |   SD/MMC.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: TQMLS1046a: unify pbi files [Sascha Hauer; 2019-05-10; 2 files; -33/+2]
| | | |   This unifies the two different pbi files. With our approach for QSPI
| | | |   booting differences in the pbi files are not necessary:
| | | |
| | | |   - We do not do execute in place for QSPI, so we do not need different
| | | |     image execution addresses
| | | |   - Setting up the QSPI clock doesn't hurt even for SD boot
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: TQMLS1046a: print life signs when debugging [Sascha Hauer; 2019-05-10; 1 file; -1/+5]
| | | |   Do the UART initialization after the SoC specific lowlevel setup and
| | | |   print the usual '>' when early debugging is enabled. To let this go out
| | | |   properly it seems we have to wait a small amount of time beforehand.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: TQMLS1046a: Sync qspi RCW from TQ U-Boot [Sascha Hauer; 2019-05-10; 1 file; -4/+4]
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: TQMLS1046a: configure qspi divider [Sascha Hauer; 2019-05-10; 1 file; -0/+3]
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: ls1046a: Add bbu handlers [Sascha Hauer; 2019-05-10; 1 file; -0/+22]
| | | |   The barebox images can simply be written to the partitions, so we can
| | | |   use bbu_register_std_file_update() for updating to MMC and QSPI.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: ls1046a: Add automatic bootsource detection xload function [Sascha Hauer; 2019-05-10; 3 files; -1/+27]
| | | |   Add a helper function which continues booting from the detected boot
| | | |   source.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: Add QSPI boot support [Sascha Hauer; 2019-05-10; 3 files; -0/+40]
| | | |   Booting Layerscape from QSPI is a bit tricky and the approach we take is
| | | |   different from the one U-Boot has taken, so it's worth writing and
| | | |   reading the following explanation.
| | | |
| | | |   The QSPI controller can map the Flash contents into the memory space (on
| | | |   LS1046a at 0x40000000). The PBL unit uses this to read the RCW from this
| | | |   memory window.
| | | |
| | | |   The Layerscape SoCs have a PowerPC history, so it seemed appropriate for
| | | |   the designers to let the QSPI controller operate in big endian mode by
| | | |   default. To let the SoC see the correct RCW we have to write the RCW and
| | | |   PBI data with be64 endianness.
| | | |
| | | |   Our PBL image tool pokes the initial binary into the SoC internal SRAM
| | | |   using PBI data as done with SD/MMC boot as well. barebox then changes
| | | |   the QSPI controller endianness to le64 to properly read the barebox
| | | |   binary (placed at a flash offset of 128KiB, so found in memory at
| | | |   0x40020000) into SDRAM and jumps to it.
| | | |
| | | |   U-Boot has another approach. Here the initial binary is executed in
| | | |   place directly at 0x40100000. This means the QSPI controller endianness
| | | |   must be swapped inside the PBI data. This has the effect that the whole
| | | |   RCW/PBI data must be 64bit endianness swapped *except* the very last
| | | |   word of the PBI data which contains the CRC command and is read already
| | | |   with changed endianness.
| | | |
| | | |   As a conclusion when porting QSPI PBI files from U-Boot to barebox skip
| | | |   commands changing the endianness in the QSPI controller and make sure
| | | |   the image is executed in internal SRAM and not in the Flash memory
| | | |   window. Lines like this should be removed:
| | | |
| | | |       09550000 000f400c
| | | |
| | | |   This sets the binary execution address:
| | | |
| | | |       09570604 40100000
| | | |
| | | |   For barebox it should be changed to 0x10000000.
| | | |
| | | |   As a result the PBI files can probably be unified between SD and QSPI
| | | |   boot.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
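Two pieces of arithmetic in the explanation above lend themselves to a short sketch: the 64-bit swap applied to the RCW/PBI data, and the address at which a flash offset appears in the memory window. The window base and the 128 KiB offset are taken from the commit message; the function names are hypothetical and the code is not the barebox PBL image tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_QSPI_WINDOW_BASE     0x40000000u     /* LS1046a, per the commit message */
#define DEMO_BAREBOX_FLASH_OFFSET (128u * 1024u)  /* 128 KiB, per the commit message */

/* Reverse the byte order of one 64-bit word. */
static uint64_t demo_swab64(uint64_t v)
{
	v = ((v & 0x00ff00ff00ff00ffULL) << 8) | ((v >> 8) & 0x00ff00ff00ff00ffULL);
	v = ((v & 0x0000ffff0000ffffULL) << 16) | ((v >> 16) & 0x0000ffff0000ffffULL);
	return (v << 32) | (v >> 32);
}

/* Swap a whole buffer of RCW/PBI words in place (illustrative helper). */
static void demo_swab64_buf(uint64_t *buf, size_t nwords)
{
	assert(buf != NULL || nwords == 0);

	for (size_t i = 0; i < nwords; i++)
		buf[i] = demo_swab64(buf[i]);
}

/* Address at which a flash offset shows up in the QSPI memory window. */
static uint32_t demo_qspi_mapped_addr(uint32_t flash_offset)
{
	return DEMO_QSPI_WINDOW_BASE + flash_offset;
}
```

In the barebox approach every word of the RCW/PBI data gets this treatment, whereas the U-Boot layout described above would have to exclude the final CRC command word.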
| * | | ARM: Layerscape: ls1046a: Add bootsource detection support [Sascha Hauer; 2019-05-10; 3 files; -0/+42]
| | | |   Not much to do, there are only a few boot sources supported.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: defconfig: Enable more features [Sascha Hauer; 2019-05-09; 1 file; -1/+7]
| | | |   The TQMLS1046a has an i2c mux and an i2c gpio expander. Add support for
| | | |   it and also disable early debugging as these are for a single board
| | | |   only.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: TQMLS1046a: Fix pinmux setup for i2c4 [Sascha Hauer; 2019-05-08; 1 file; -0/+6]
| | | |   With this the I2C mux on i2c4 works properly.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: TQMLS1046a: Unify SD and eMMC images [Sascha Hauer; 2019-05-08; 2 files; -91/+7]
| | | |   TQ has unified SD and eMMC images in their U-Boot. Do the same in
| | | |   barebox as well.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: TQMLS1046a: Update device tree files from tq repository [Sascha Hauer; 2019-05-08; 2 files; -25/+46]
| | | |   Update TQMLS1046a device tree files from TQ repository as of
| | | |   rocko.TQMLS1046A.BSP.SW.0002
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: TQMLS1046a: Use static DDR settings [Sascha Hauer; 2019-05-08; 1 file; -14/+68]
| | | |   TQ prefers static values in their U-Boot, so use these values in barebox
| | | |   as well.
| | | |
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: Layerscape: TQMLS1046a: Set cpo_sample value [Sascha Hauer; 2019-05-08; 1 file; -1/+1]
| | |/
| |/|
| | |   Starting the board issues the warning:
| | |
| | |       WARN: pls set popts->cpo_sample = 0x48
| | |
| | |   So set the value to the desired value.
| | |
| | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* | | Merge branch 'for-next/imx' [Sascha Hauer; 2019-05-10; 10 files; -33/+34]
|\ \ \
| * | | ARM: zii-vf610-dev: Add ZII SSMB DTU board [Andrey Smirnov; 2019-04-26; 4 files; -2/+26]
| | | |   Add the Zodiac Digital Tapping Unit, a VF610 based network device with 5
| | | |   Ethernet ports.
| | | |
| | | |   Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: zii-imx8mq-dev: Drop unnecessary barrier() in switch statement [Andrey Smirnov; 2019-04-23; 1 file; -7/+0]
| | | |   AArch64 uses PC-relative addressing instead of absolute one for data
| | | |   lookups, so compiling switch statement into a LUT shouldn't be a
| | | |   problem regardless if relocation happened or not. Disassembly of PBL
| | | |   code looks almost exactly the same with or without this workaround, so
| | | |   it is clearly not needed. Drop it.
| | | |
| | | |   Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
| | | |   Cc: Heiner Kallweit <hkallweit1@gmail.com>
| | | |   Cc: Chris Healy <cphealy@gmail.com>
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: zii-imx51-rdu1: Use -fno-tree-switch-conversion -fno-jump-tables [Andrey Smirnov; 2019-04-23; 2 files; -7/+1]
| | | |   Original and very poor workaround no longer works against GCC8, so drop
| | | |   it and replace with a proper solution that should've been used in the
| | | |   first place - specifying -fno-tree-switch-conversion -fno-jump-tables
| | | |   as CFLAGS when building lowlevel.c
| | | |
| | | |   Tested to work with:
| | | |
| | | |   - GCC 8.2.1 (arm-none-eabi)
| | | |   - GCC 7.1.0 (arm-none-eabi)
| | | |   - GCC 4.8.4 (armv7l-timesys-linux-gnueabihf)
| | | |
| | | |   Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
| | | |   Cc: Heiner Kallweit <hkallweit1@gmail.com>
| | | |   Cc: Chris Healy <cphealy@gmail.com>
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
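The kind of code these flags affect looks roughly like the switch below. This is a made-up example, not the board's lowlevel.c: the enum, the function, and the returned values are hypothetical. With switch conversion enabled, GCC may turn such a switch into a load from a constant table, and with jump tables enabled it may dispatch through a table of code addresses. Judging from the neighbouring AArch64 commit, the concern on 32-bit ARM is that such tables can be reached through absolute addresses, which are not valid before relocation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical board variants, for illustration only. */
enum demo_board {
	DEMO_BOARD_A,
	DEMO_BOARD_B,
	DEMO_BOARD_C,
	DEMO_BOARD_D,
	DEMO_BOARD_E,
};

/*
 * A dense switch with constant results: a typical candidate for
 * -ftree-switch-conversion (lookup table) or a jump table. Building the
 * file with -fno-tree-switch-conversion -fno-jump-tables keeps it as a
 * chain of compares and branches.
 */
static uint32_t demo_ram_size_mib(enum demo_board board)
{
	switch (board) {
	case DEMO_BOARD_A:
		return 128;
	case DEMO_BOARD_B:
		return 256;
	case DEMO_BOARD_C:
		return 512;
	case DEMO_BOARD_D:
		return 1024;
	case DEMO_BOARD_E:
		return 2048;
	default:
		return 0;
	}
}
```

The result of the function is the same either way; only the generated code differs, which is why the fix lives in the compiler flags for that one file rather than in the source.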
| * | | ARM: zii-vf610-dev: Use -fno-tree-switch-conversion -fno-jump-tables [Andrey Smirnov; 2019-04-23; 2 files; -17/+1]
| | | |   Original and very poor workaround no longer works against GCC8, so drop
| | | |   it and replace with a proper solution that should've been used in the
| | | |   first place - specifying -fno-tree-switch-conversion -fno-jump-tables
| | | |   as CFLAGS when building lowlevel.c
| | | |
| | | |   Tested to work with:
| | | |
| | | |   - GCC 8.2.1 (arm-none-eabi)
| | | |   - GCC 7.1.0 (arm-none-eabi)
| | | |   - GCC 4.8.4 (armv7l-timesys-linux-gnueabihf)
| | | |
| | | |   Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
| | | |   Cc: Heiner Kallweit <hkallweit1@gmail.com>
| | | |   Cc: Chris Healy <cphealy@gmail.com>
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: imx8mq: link PCIE1 and PCIE2 power domains [Lucas Stach; 2019-04-23; 1 file; -0/+4]
| | | |   Those two power domains can only be used together. The link between the
| | | |   two has been dropped with the dts merge of v5.1-rc1. Fix this.
| | | |
| | | |   Fortunately the i.MX8MQ PCIe support will land in Linux 5.2, so we can
| | | |   drop all those PCIe related local DT overrides with the next big dts
| | | |   upstream sync.
| | | |
| | | |   Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
| | | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | | ARM: i.MX25: Add some more clocks [Steffen Trumtrar; 2019-04-12; 1 file; -0/+2]
| | |/
| |/|
| | |   Add some clocks needed for:
| | |
| | |   - RNGB
| | |   - SCC
| | |   - Dryice RTC
| | |
| | |   Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
| | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* | | Merge branch 'for-next/ctrlc' [Sascha Hauer; 2019-05-10; 1 file; -1/+1]
|\ \ \
| * | | Shell: Handle aborting loops better [Sascha Hauer; 2019-04-24; 1 file; -1/+1]
| |/ /
| | |   It's easy to get stuck in an infinite loop in the hush shell:
| | |
| | |       while true; do sleep 1; done
| | |
| | |   The 'sleep' command will check for ctrl-c with the ctrlc() function.
| | |   This will abort the sleep command. Hush then checks for ctrl-c again in
| | |   the loop. The ctrl-c in the buffer has already been eaten by the sleep
| | |   command, so the loop will continue.
| | |
| | |   With this patch we remember the presence of a ctrl-c character in a
| | |   variable instead of checking for a new character each time. The
| | |   variable must be reset explicitly by calling ctrlc_handled() which
| | |   will be called by the shell in the outer loop.
| | |
| | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
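The latching behavior described above can be modelled in a few lines of hosted C. Everything here is a simplified stand-in: the console is a string, and all `demo_` names are hypothetical. Only the idea of a sticky flag that is cleared by an explicit "handled" call comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_CTRL_C '\x03'

static const char *demo_input = ""; /* simulated console buffer */
static bool demo_ctrlc_seen;        /* sticky "ctrl-c was pressed" flag */

static void demo_console_feed(const char *s)
{
	demo_input = s;
}

/* Consume pending input; latch the flag if a ctrl-c is found. */
static bool demo_ctrlc(void)
{
	while (*demo_input) {
		if (*demo_input++ == DEMO_CTRL_C)
			demo_ctrlc_seen = true;
	}
	return demo_ctrlc_seen;
}

/* Explicit reset, done by the outer shell loop once the abort is handled. */
static void demo_ctrlc_handled(void)
{
	demo_ctrlc_seen = false;
}

/* Models "while true; do sleep 1; done"; returns the iterations executed. */
static int demo_run_loop(int max_iterations)
{
	int n = 0;

	while (n < max_iterations) {
		n++;
		(void)demo_ctrlc();  /* the "sleep" command polls and eats the input */
		if (demo_ctrlc())    /* the loop check still sees the latched flag */
			break;
	}
	demo_ctrlc_handled();
	return n;
}
```

Without the sticky flag the second check would find an empty buffer and the loop would keep running, which is the failure mode the commit message describes.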
* | | Merge branch 'for-next/arm' [Sascha Hauer; 2019-05-10; 6 files; -11/+80]
|\ \ \
| |_|/
|/| |
| * | ARM: mmu: mark uncached regions as eXecute never on v7 [Ahmad Fatoum; 2019-04-29; 3 files; -9/+41]
| | |   The ARM Cortex-A Series Programmer's Guide notes[1]:
| | |
| | |   > When set, the Execute Never (XN) bit in the translation table entry
| | |   > prevents speculative instruction fetches taking place from desired
| | |   > memory locations and will cause a prefetch abort to occur if execution
| | |   > from the memory location is attempted.
| | |   >
| | |   > Typically device memory regions are marked as execute never to prevent
| | |   > accidental execution from such locations, and to prevent undesirable
| | |   > side-effects which might be caused by speculative instruction fetches.
| | |
| | |   Heed the advice and mark uncached memory with the XN bit, when the CPU
| | |   is >=v7.
| | |
| | |   It's possible that there are SoCs that have a section shared between
| | |   device memory and the on-chip RAM hosting the PBL. In such a section,
| | |   every page except for the OCRAM's should be mapped XN, but as we know of
| | |   no SoC with such an OCRAM layout, we ignore this possibility for now and
| | |   let mmu_early_enable map sections only.
| | |
| | |   [1]: 9.6.3 "Execute Never", Version 4.0
| | |
| | |   Suggested-by: Lucas Stach <l.stach@pengutronix.de>
| | |   Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
| | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
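How a section entry could pick up the XN bit depending on the CPU architecture is sketched below. The bit positions follow the ARMv7-A short-descriptor section format (type in bits 1:0, XN in bit 4, AP[1:0] in bits 11:10). The `DEMO_` macros and both functions are hypothetical names, and the choice of access permission bits is an assumption rather than a copy of barebox's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* ARMv7-A short-descriptor section entry bits (names are illustrative). */
#define DEMO_PMD_TYPE_SECT     (2u << 0)  /* bits 1:0 = 0b10: section */
#define DEMO_PMD_SECT_XN       (1u << 4)  /* Execute Never */
#define DEMO_PMD_SECT_AP_WRITE (1u << 10) /* AP[0] */
#define DEMO_PMD_SECT_AP_READ  (1u << 11) /* AP[1] */

/* Flags for an uncached 1 MiB section; XN is only added for v7 and later. */
static uint32_t demo_uncached_section_flags(int cpu_arch)
{
	uint32_t flags = DEMO_PMD_TYPE_SECT | DEMO_PMD_SECT_AP_WRITE |
			 DEMO_PMD_SECT_AP_READ;

	assert(cpu_arch > 0);

	if (cpu_arch >= 7)
		flags |= DEMO_PMD_SECT_XN;

	return flags;
}

/* First-level entry for the 1 MiB section containing phys. */
static uint32_t demo_section_entry(uint32_t phys, int cpu_arch)
{
	return (phys & 0xfff00000u) | demo_uncached_section_flags(cpu_arch);
}
```

Because the sketch works on whole sections, it has the same limitation the commit message points out: a section shared between device memory and OCRAM could not be split into XN and executable pages this way.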
| * | ARM: mmu: remove doubly defined macro [Ahmad Fatoum; 2019-04-29; 1 file; -1/+0]
| | |   PMD_SECT_DEF_CACHED is defined along with PMD_SECT_DEF_UNCACHED in
| | |   mmu.h, which is included two lines prior.
| | |
| | |   Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
| | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | ARM: cache-armv7: start invalidation from outer levels [Ahmad Fatoum; 2019-04-29; 1 file; -1/+11]
| | |   On 25/4/19 11:57, Lucas Stach wrote:
| | |   > [T]he sequence that could go wrong in Barebox is as follows:
| | |   > 1. CPU core starts invalidating at L1 cache level
| | |   > 2. HW prefetcher decides that a specific address should be brought into
| | |   >    the L1 cache.
| | |   > 3. HW prefetcher finds a valid block for the requested address in L2
| | |   >    cache and moves cached data from L2 to L1.
| | |   > 4. Only now CPU core invalidates L2 cache.
| | |   >
| | |   > In the above sequence we now have invalid data in the L1 cache line.
| | |   > The correct sequence will avoid this issue:
| | |   >
| | |   > 1. CPU core starts invalidating at L2 cache level
| | |   > 2. HW prefetcher decides that a specific address should be brought into
| | |   >    the L1 cache.
| | |   > 3. HW prefetcher sees invalid tags for the requested address in L2
| | |   >    cache and brings in the data from external memory.
| | |   > 4. CPU core invalidates L1 cache, discarding the prefetched data.
| | |
| | |   The ARM Cortex-A Series Programmer's Guide addresses this issue in the
| | |   SMP-context[1]:
| | |   > If another core were to access the affected address between those
| | |   > two actions, a coherency problem can occur. Such problems can be avoided
| | |   > by following two simple rules.
| | |   >
| | |   > * When cleaning, always clean the innermost (L1) cache first and then
| | |   >   clean the outer cache(s).
| | |   > * When invalidating, always invalidate the outermost cache first and
| | |   >   the L1 cache last.
| | |
| | |   The current code correctly iterates from inner to outer cache levels
| | |   when flushing/cleaning (r8 == 0), invalidation (r8 == 1) occurs in the
| | |   same direction though. Adjust the invalidation iteration order to start
| | |   from the outermost cache instead.
| | |
| | |   Equivalent C-Code:
| | |
| | |       enum cache_op { CACHE_FLUSH = 0, CACHE_INVAL = 1 };
| | |       register enum cache_op operation asm("r8");
| | |       register int i asm("r12");
| | |       register int limit asm("r3") = max_cache_level << 1; // e.g. 4 with L2 max
| | |
| | |       +if (operation == CACHE_FLUSH) {
| | |            i = 0;
| | |       +} else {
| | |       +    i = limit - 2;
| | |       +}
| | |
| | |       bool loop_again;
| | |       do {
| | |           /* [snip] */
| | |       +    if (operation == CACHE_FLUSH) {
| | |                i += 2;
| | |                loop_again = limit > i;
| | |       +    } else {
| | |       +        loop_again = i > 0;
| | |       +        i -= 2;
| | |       +    }
| | |       } while (loop_again);
| | |
| | |   [1]: 18.6 "TLB and cache maintenance broadcast", Version 4.0
| | |
| | |   Suggested-by: Lucas Stach <l.stach@pengutronix.de>
| | |   Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
| | |   Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
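The iteration order from the "Equivalent C-Code" above can be reproduced on a host to see which level is visited when. The sketch keeps the loop structure of the commit message, but replaces the register variables with ordinary locals and records the visited cache level index (`i >> 1`) in place of the snipped maintenance work. The `demo_` names and the recording array are additions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_cache_op { DEMO_CACHE_FLUSH = 0, DEMO_CACHE_INVAL = 1 };

/*
 * Walk the cache levels in the order the patched assembly would and store
 * the zero-based level index of each step in order[]. Returns the number
 * of levels visited.
 */
static int demo_cache_level_order(enum demo_cache_op operation,
				  int max_cache_level, int *order, int capacity)
{
	int limit = max_cache_level << 1; /* e.g. 4 with L2 max */
	int count = 0;
	bool loop_again;
	int i;

	assert(max_cache_level >= 1 && capacity >= max_cache_level);

	if (operation == DEMO_CACHE_FLUSH)
		i = 0;          /* start at the innermost level */
	else
		i = limit - 2;  /* start at the outermost level */

	do {
		order[count++] = i >> 1; /* stands in for the per-level work */

		if (operation == DEMO_CACHE_FLUSH) {
			i += 2;
			loop_again = limit > i;
		} else {
			loop_again = i > 0;
			i -= 2;
		}
	} while (loop_again);

	return count;
}
```

With two cache levels, a flush records levels 0 then 1 (L1, then L2), while an invalidation records 1 then 0 (L2, then L1), matching the two rules quoted from the Programmer's Guide.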