(Read from here for position independent code PIC).
                        from u-boot : kernel_entry(0, machid, r2); arch/arm/lib/bootm.
                                      r0 = 0, r1 = machid, r2 = dtb pointer or atags pointer.
2). Kernel entry point 
                bl      __lookup_processor_type
                bl      __create_page_tables
                ldr     r13, =__mmap_switched
                    ldr     r12, [r10, #PROCINFO_INITFUNC] 
                    add     r12, r12, r10
                    ret     r12            (branching to arch setup to initialize TLB, Cache, MMU). arch/arm/mm/proc-v7.S
                  b       __enable_mmu
                  b       __turn_mmu_on
                          mov     r3, r13 
                          ret     r3  (branch to virtual address stored in r13 i.e __mmap_switched).
                 __mmap_switched:       (arch/arm/kernel/head-common.S).
                    b       start_kernel   (init/main.c)
3). Register content before calling start_kernel :

         * The processor init function will be called with:
         *  r1 - machine type
         *  r2 - boot data (atags/dt) pointer
         *  r4 - translation table base (low word)
         *  r5 - translation table base (high word, if LPAE)
         *  r8 - translation table base 1 (pfn if LPAE)
         *  r9 - cpuid
         *  r13 - virtual address for __enable_mmu -> __turn_mmu_on
         *
         * On return, the CPU will be ready for the MMU to be turned on,
         * r0 will hold the CPU control register value, r1, r2, r4, and
         * r9 will be preserved.  r5 will also be preserved if LPAE.

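4). Device Tree:
    The device tree is a data structure that describes the hardware to the
    kernel. The concept comes from Open Firmware (abbreviated OF, which is why
    the kernel helpers are prefixed of_). The Flattened Device Tree (FDT) is
    its compact binary form, the .dtb blob, which the bootloader passes to the
    kernel (in r2 on 32-bit ARM) and which the kernel parses while booting.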
5). The syntax to enable/disable interrupts is "CPS<IE/ID> <i/f/a>" for example:

CPSID   i   <-- mask IRQs
CPSIE   f   <-- unmask FIQs
CPSID   if  <-- mask IRQs and FIQs

CPS : Change Processor State.

local_irq_disable();  ==> disables IRQs on the current CPU, i.e. sets (masks) the I bit of the CPSR.
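
The effect of the CPSID/CPSIE forms on the CPSR mask bits can be modelled in plain C. This is only a user-space sketch: cps_id(), cps_ie() and the CPSR_* names are made up for illustration, and real kernel code calls local_irq_disable()/local_irq_enable() instead of touching the bits itself. A set bit means the corresponding exception is masked.

```c
#include <stdint.h>

/* CPSR mask bits on ARMv7-A: A = bit 8, I = bit 7, F = bit 6. */
#define CPSR_A (1u << 8)  /* asynchronous abort mask */
#define CPSR_I (1u << 7)  /* IRQ mask */
#define CPSR_F (1u << 6)  /* FIQ mask */

/* Model of "CPSID <flags>": set (mask) the selected bits. */
uint32_t cps_id(uint32_t cpsr, uint32_t flags)
{
    return cpsr | (flags & (CPSR_A | CPSR_I | CPSR_F));
}

/* Model of "CPSIE <flags>": clear (unmask) the selected bits. */
uint32_t cps_ie(uint32_t cpsr, uint32_t flags)
{
    return cpsr & ~(flags & (CPSR_A | CPSR_I | CPSR_F));
}
```

For example, cps_id(cpsr, CPSR_I | CPSR_F) corresponds to "CPSID if".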

6). The fdt_*() family of functions accesses the flattened device tree blob directly. The of_*()
    family accesses the tree data structure that the kernel builds once the blob has been
    unflattened during boot.

7). Bus addresses :  The addresses used between peripheral buses and memory. Often, they are the
    same as the physical addresses used by the processor, but that is not necessarily the case.
    Some architectures can provide an I/O memory management unit(IOMMU) that remaps addresses between 
    a bus and main memory. An IOMMU can make a buffer scattered in memory appear contiguous to the device.
    Bus addresses are highly architecture dependent, of course.
8). Kernel logical addresses : The kernel maps the first part of physical memory into its own
    virtual address space linearly, so a logical address differs from the corresponding physical
    address only by a constant offset (a 1:1 mapping plus an offset). Kernel virtual addresses
    outside that region, by contrast, are mapped page by page to arbitrary physical pages; those
    mappings do not follow the 1:1 pattern of the logical mapping area.
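
The constant-offset relationship of the logical mapping can be sketched in a few lines of C. The two constants are assumed values for a 32-bit system with a 3G/1G split and RAM starting at 0x80000000; they are not taken from any particular board. The function names are hypothetical, and the kernel's own __pa()/__va() macros play this role for the linear map.

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define PAGE_OFFSET 0xC0000000u  /* virtual start of the kernel linear mapping */
#define PHYS_OFFSET 0x80000000u  /* physical address of the first byte of RAM */

/* Logical (linear-map) virtual address -> physical address. */
uint32_t logical_to_phys(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET + PHYS_OFFSET;
}

/* Physical address -> logical (linear-map) virtual address. */
uint32_t phys_to_logical(uint32_t paddr)
{
    return paddr - PHYS_OFFSET + PAGE_OFFSET;
}
```

No page-table walk is needed for such addresses, which is what separates them from the page-by-page mappings.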
9). Scatter/gather mappings : Scatter/gather mappings are a special type of streaming DMA mapping.

10). The Cortex-A53 processor has the following features :
     --> In-order, eight-stage pipeline.
     --> Lower power consumption from the use of hierarchical clock gating, power domains, and
         advanced retention modes.
     --> Increased dual-issue capability from duplication of execution resources and dual
         instruction decoders.
11). The following is a typical example of what software runs at each Exception level:
        EL0 Normal user applications.
        EL1 Operating system kernel typically described as privileged.
        EL2 Hypervisor.
        EL3 Low-level firmware, including the Secure Monitor.
        --> Hypervisor
                This runs at EL2, which is always Non-secure. The hypervisor, when present and
                enabled, provides virtualization services to rich OS kernels.

12). Execution states:
        The ARMv8 architecture defines two Execution States, AArch64 and AArch32.
        In AArch32 state, Trusted OS software executes in Secure EL3, and in AArch64 state it
        primarily executes in Secure EL1.
13). Movement between Exception levels follows these rules:
        • Moves to a higher Exception level, such as from EL0 to EL1, indicate increased software
          execution privilege.
        • An exception cannot be taken to a lower Exception level.
        • There is no exception handling at level EL0, exceptions must be handled at a higher
          Exception level.
        • Ending exception handling and returning to the previous Exception level is performed by
          executing the ERET instruction.
        • Returning from an exception can stay at the same Exception level or enter a lower
          Exception level. It cannot move to a higher Exception level.  
        • The security state does not change with a change of Exception level, except when taking an
          exception to EL3 from a Non-secure state or returning from EL3 to a Non-secure state.
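
The movement rules in this list can be written down as two small predicates. This is a sketch of the rules as stated here, not of any architectural pseudocode; the enum and function names are hypothetical.

```c
#include <stdbool.h>

enum el { EL0 = 0, EL1, EL2, EL3 };

/* An exception is taken to the same or a higher level, and never to EL0. */
bool can_take_exception(enum el from, enum el to)
{
    return to != EL0 && to >= from;
}

/* ERET stays at the same level or drops to a lower one. */
bool can_eret(enum el from, enum el to)
{
    return to <= from;
}
```

So can_take_exception(EL0, EL1) holds, while can_take_exception(EL1, EL0) and can_eret(EL1, EL2) do not.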
14). An AArch32 operating system cannot host a 64-bit application.
     --> You can only change execution state by changing Exception level. Taking an exception might
         change from AArch32 to AArch64, and returning from an exception may change from AArch64
         to AArch32.

    --> Code at EL3 cannot take an exception to a higher exception level, so cannot change execution
        state, except by going through a reset.
    --> an AArch64 hypervisor can host both AArch32 and AArch64 guest operating systems. However,  
        a 32-bit operating system cannot host a 64-bit application and a 32-bit hypervisor cannot 
        host a 64-bit guest operating system.
    --> Changing to AArch32 requires going from a higher to a lower Exception level. This is the
        result of exiting an exception handler by executing the ERET instruction.

    --> Changing to AArch64 requires going from a lower to a higher Exception level. The
        exception can be the result of an instruction execution or an external signal.
    NOTE: A hypervisor or operating system executing in AArch64 state can support AArch32
          operation at lower privilege levels. This means that an OS running in AArch64 can
          host both AArch32 and AArch64 applications.
15). U-Boot init path.

=>             b       reset   : (arch/arm/lib/vectors.S) --> reset:    (arch/arm/cpu/armv7/start.S).
                                                            /* Allow the board to save important registers */
                                                                    b      save_boot_params
                                            ENTRY(save_boot_params) : (arch/arm/cpu/armv7/start.S)
                                            b       save_boot_params_ret
                                            save_boot_params_ret: (arch/arm/cpu/armv7/start.S).
                                            disable interrupts (FIQ and IRQ), also set the cpu to SVC32 mode,
                                            except if in HYP mode already.
                                            --> Setup CP15 barrier 
                                             To access the HSCTLR (Hyp System Control Register) use:
                                            MRC p15,4,<Rt>,c1,c0,0 ; Read HSCTLR into Rt
                                            MCR p15,4,<Rt>,c1,c0,0 ; Write Rt to HSCTLR
                                            bit-5 for CP15 barrier enablement.
                                           --> CP15BEN, bit [5]
                                            System instruction memory barrier enable. Enables accesses to the DMB, DSB,
                                            and ISB System instructions in the (coproc==0b1111) encoding space from EL2:

                                            CP15BEN                    Meaning 
                                            0b0                        EL2 execution of the CP15DMB, CP15DSB, and CP15ISB 
                                                                    instructions is UNDEFINED.

                                            0b1                        EL2 execution of the CP15DMB, CP15DSB, and CP15ISB 
                                                                    instructions is enabled.
                                            CP15BEN is optional, but if it is implemented in the SCTLR then it must
                                            also be implemented in the HSCTLR. If it is not implemented then this bit
                                            is RAO/WI. In a system where the PE resets into EL2, this field resets
                                            to an architecturally UNKNOWN value.
                                                bl                 _main   (arch/arm/lib/crt0.S).
                                            (This file handles the target-independent stages of the U-Boot start-up where 
                                            a C runtime environment is needed. Its entry point
                                            is _main and is branched into from the target's start.S file.).
                                            bl                  board_init_f_mem (common/init/board_init.c).
                                                                (first c function).
                                                                This function will set memory for global data structure gd.
                                                                and initialize with zero.
                                                                and return to _main (arch/arm/lib/crt0.S).
                                            clear .bss section..
                                            bl           board_init_f (common/board_f.c).
                                            This function prepares the hardware for
                                            execution from system RAM (DRAM, DDR...)
                                                      initcall_run_list(init_sequence_f); (lib/initcall.c)
                                             Once board_init_f() function return to _main.
                                            b               relocate_code(arch/arm/lib/relocate.S).
                                           ldr     pc, =board_init_r       /* this is auto-relocated! */ (common/board_r.c).
                                                   Finally board_init_r will get called from _main.
                                                This environment has BSS (initialized to 0), initialized non-const
                                                data (initialized to their intended value), and stack in system
                                                RAM (for SPL, moving the stack and GD into RAM is optional).
                                        NOTE: board_init_r() must not return to _main. All remaining
                                              peripherals are initialized from this function, which then
                                              enters the main loop. Control should never come back to
                                              board_init_r() either:
                                        if (initcall_run_list(init_sequence_r))
                                                hang();

                                        /* NOTREACHED - run_main_loop() does not return */
                                        static int run_main_loop(void)   (common/board_r.c).
                                                #ifdef CONFIG_SANDBOX
                                        /* main_loop() can return to retry autoboot, if so just run it again */
                                                    for (;;)
                                                    main_loop();    (common/main.c).
                                                    return 0;
    Remaining exception vectors that follow "b reset" (arch/arm/lib/vectors.S):
    ldr     pc, _undefined_instruction
    ldr     pc, _software_interrupt
    ldr     pc, _prefetch_abort
    ldr     pc, _data_abort
    ldr     pc, _not_used
    ldr     pc, _irq
    ldr     pc, _fiq
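
The way board_init_f() and board_init_r() walk their init sequences can be sketched as a loop over a NULL-terminated array of function pointers that stops at the first failure. This is a simplified stand-in written for these notes, not the code in lib/initcall.c: run_init_list(), step_ok(), step_fail() and steps_run are hypothetical names, and only the init_fnc_t typedef mirrors U-Boot.

```c
#include <stddef.h>

typedef int (*init_fnc_t)(void);

/* Run each function in a NULL-terminated list; stop at the first failure. */
int run_init_list(const init_fnc_t list[])
{
    for (const init_fnc_t *fn = list; *fn != NULL; fn++) {
        if ((*fn)() != 0)
            return -1;   /* the caller is expected to hang() */
    }
    return 0;
}

/* Hypothetical init steps standing in for entries of init_sequence_f. */
int steps_run;
int step_ok(void)   { steps_run++; return 0; }
int step_fail(void) { steps_run++; return 1; }
```

With a list such as { step_ok, step_fail, step_ok, NULL }, the third entry is never reached, which is why a failing init call leaves the board hung rather than half initialized.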

16). How do two CPUs communicate?
     Most of the time they communicate through memory, or through the nearest level of the memory
     hierarchy that they share. Cores on the same chip usually share an L2 or L3 cache. Cores on
     different chips communicate through memory, or through cache-to-cache transfers driven by the
     cache coherency protocol.
17). The AArch64 execution state provides 31 × 64-bit general-purpose registers (X0-X30), accessible
     at all times and in all Exception levels. Each is also accessible in 32-bit form as W0-W30.
     In addition to the 31 core registers, there are also several special registers.
     -->Zero register.
     -->Program counter.
     -->Stack pointer.
     -->Program Status Register.
     -->Exception Link Register.
     In the ARMv8 architecture, when executing in AArch64, the exception return state is held in the
     following dedicated registers for each Exception level:
     -->Exception Link Register (ELR).
     -->Saved Processor State Register (SPSR).
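
The relationship between an X register and its W view can be modelled with ordinary integer casts. The sketch below assumes the AArch64 rule that a write to Wn zero-extends into Xn; read_w() and write_w() are hypothetical helper names, not instructions.

```c
#include <stdint.h>

/* Reading Wn yields the low 32 bits of Xn. */
uint32_t read_w(uint64_t x)
{
    return (uint32_t)x;
}

/* Writing Wn zero-extends into Xn: the upper 32 bits become 0. */
uint64_t write_w(uint64_t x_old, uint32_t w)
{
    (void)x_old;          /* the previous upper half is discarded */
    return (uint64_t)w;
}
```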
18). Exception Link Register (ELR) : The Exception Link Register holds the exception return address.

19). Processor state: 
        AArch64 does not have a direct equivalent of the ARMv7 Current Program Status Register
        (CPSR). In AArch64, the components of the traditional CPSR are supplied as fields that can be
        made accessible independently. These are referred to collectively as Processor State (PSTATE).
20). The block is an abstraction of the filesystem: filesystems can be accessed only in multiples of a block.
     When a block is stored in memory (say, after a read or pending a write) it is stored in a buffer. Each
     buffer is associated with exactly one block.
     ==> A single page can hold one or more blocks in memory.
     ==> Each buffer is associated with a descriptor. The descriptor is called a buffer head and is
         of type struct buffer_head. The buffer_head structure holds all the information that the
         kernel needs to manipulate buffers.
     ==> The first problem with the buffer head was that it was a large and unwieldy data structure, and it
         was neither clean nor simple to manipulate data in terms of buffer heads. The kernel prefers to work
         in terms of pages, which are simple and allow greater performance. A large buffer head describing
         each individual buffer (which might be smaller than a page) was inefficient.
     ==> The second problem is that a buffer head describes only a single buffer. When used as the
         container for all I/O operations, the buffer head forces the kernel to break up
         potentially large block I/O operations (say, a write) into multiple buffer_head structures.
         This results in needless overhead and space consumption. For this reason the 2.5 development
         kernel introduced the bio structure as the container for block I/O operations.
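
How many blocks, and therefore buffers, share one page is simple arithmetic. The sketch below assumes a 4096-byte page and the usual constraint that a block size is a power of two between 512 bytes and the page size; both are assumptions of this example, and blocks_per_page() is a hypothetical helper.

```c
#define PAGE_SIZE_BYTES 4096u   /* assumed page size */

/* Number of blocks that fit in one page, or 0 if the block size is invalid. */
unsigned blocks_per_page(unsigned block_size)
{
    /* must lie between one 512-byte sector and one page */
    if (block_size < 512u || block_size > PAGE_SIZE_BYTES)
        return 0;
    /* must be a power of two */
    if (block_size & (block_size - 1u))
        return 0;
    return PAGE_SIZE_BYTES / block_size;
}
```

With 1024-byte blocks a page holds four buffers, so describing one page of I/O takes four buffer heads. That per-buffer overhead is what the notes above refer to.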

21). The difference between buffer heads and the bio structure is important. The bio
    structure represents an I/O operation, which may include one or more pages in memory.
    The buffer_head structure, on the other hand, represents a single buffer, which describes
    a single block on the disk.
    ==> The bio structure can represent both normal page I/O and direct I/O (I/O operations
        that do not go through the page cache).
    ==> The concept of buffer heads is still required, however; buffer heads function as descriptors,
        mapping disk blocks to pages. The bio structure does not contain any information about the state
        of a buffer. It is simply an array of vectors describing one or more segments of data for a single
        block I/O operation, plus related information.
        struct bio {}; ==> blk_types.h
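
The "array of vectors" idea can be illustrated with a stripped-down user-space model. The two structs below are simplified stand-ins invented for this note (struct seg loosely follows the page/length/offset triple of the kernel's struct bio_vec); they are not the definitions in blk_types.h, and io_op_bytes() is a hypothetical helper.

```c
/* Simplified stand-in for one segment: a chunk of data inside one page. */
struct seg {
    void *page;          /* page holding the data */
    unsigned int len;    /* bytes in this segment */
    unsigned int offset; /* byte offset of the data within the page */
};

/* Simplified stand-in for one block I/O operation made of several segments. */
struct io_op {
    unsigned long sector;    /* starting sector on the device */
    unsigned short nsegs;    /* number of entries in segs[] */
    const struct seg *segs;  /* segments, not necessarily adjacent in memory */
};

/* Total payload of the operation: the sum of all segment lengths. */
unsigned int io_op_bytes(const struct io_op *op)
{
    unsigned int total = 0;
    for (unsigned short i = 0; i < op->nsegs; i++)
        total += op->segs[i].len;
    return total;
}
```

One io_op can cover pages scattered anywhere in memory, whereas the buffer-head approach would need one descriptor per block.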

22). Block devices maintain request queues to store their pending block I/O requests. The request queue
    is represented by the request_queue structure. The request queue contains a doubly linked list of
    requests and associated control information. Requests are added to the queue by higher-level code
    in the kernel, such as filesystems.
    ==> Each item in the queue's request list is a single request, of type struct request.
    ==> Each request can be composed of more than one bio structure because individual requests can operate
        on multiple consecutive disk blocks. Note that although the blocks on the disk must be adjacent, the
        blocks in memory need not be; each bio structure can describe multiple segments (recall, segments are
        contiguous chunks of a block in memory) and the request can be composed of multiple bio structures.
23). Simply sending requests to the block devices in the order that the kernel issues them, as soon as it
    issues them, results in poor performance. The kernel therefore does not issue block I/O requests to the
    disk in the order they are received or as soon as they are received. Instead, it performs operations
    called merging and sorting to greatly improve the performance of the system as a whole.
    NOTE: The subsystem of the kernel that performs these operations is called the I/O scheduler.
24). I/O scheduler: It divides the resource of disk I/O among the pending block I/O
    requests in the system. It does this through the merging and sorting of pending requests
    in the request queue.
    -> The Linus Elevator.
    -> The Deadline I/O Scheduler.
    -> The Anticipatory I/O Scheduler.
    -> The Complete Fair Queuing I/O Scheduler.
    -> The Noop I/O Scheduler.
    process scheduler: Divides the resource of the processor among the processes on the system.
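
Merging and sorting can be shown on a toy request queue: sort the pending requests by starting sector, then fold any request that begins exactly where the previous one ends into its predecessor. This is an illustrative sketch only. struct io_req and sort_and_merge() are hypothetical, and none of the schedulers listed above is implemented this way in detail.

```c
#include <stdlib.h>

struct io_req {
    unsigned long sector;    /* first sector of the request */
    unsigned long nsect;     /* number of sectors */
};

/* qsort comparator: ascending start sector. */
static int by_sector(const void *a, const void *b)
{
    const struct io_req *x = a, *y = b;
    return (x->sector > y->sector) - (x->sector < y->sector);
}

/* Sort by start sector, then merge back-to-back requests in place.
 * Returns the number of requests left in q[]. */
size_t sort_and_merge(struct io_req *q, size_t n)
{
    if (n == 0)
        return 0;
    qsort(q, n, sizeof(*q), by_sector);

    size_t out = 0;
    for (size_t i = 1; i < n; i++) {
        if (q[out].sector + q[out].nsect == q[i].sector)
            q[out].nsect += q[i].nsect;   /* adjacent on disk: merge */
        else
            q[++out] = q[i];              /* gap: keep as a separate request */
    }
    return out + 1;
}
```

Four requests at sectors 100, 8, 0 and 108 collapse into two, one covering sectors 0-15 and one covering 100-111, so the disk head moves in one direction and services fewer, larger requests.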
25). The Linux kernel implements a disk cache called the page cache. The goal of this cache is
     to minimize disk I/O by storing data in physical memory that would otherwise require
     disk access. Changes made in the page cache are propagated back to disk, which is called
     page writeback.
26). The page cache consists of physical pages in RAM, the contents of which correspond to
     physical blocks on a disk. The size of the page cache is dynamic; it can grow to consume
     any free memory and shrink to relieve memory pressure.
27). Write Caching :
        --> no-write : the cache does not cache write operations. Data is written directly to
            disk and the cached copy is invalidated.
        --> write-through : a write operation automatically updates both the in-memory
            cache and the on-disk file. This approach has the benefit of keeping
            the cache coherent, that is, synchronized with the backing store.
        --> write-back : in a write-back cache, processes perform write operations directly
            into the page cache.
            Written-to pages in the page cache are marked as dirty and are added to a dirty list.
            Periodically, pages in the dirty list are written back to disk in a process called writeback.
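
The write-back policy can be demonstrated with a tiny in-memory model: a write touches only the cached page and sets its dirty flag, and a later writeback pass copies every dirty page to the backing store. Everything here (struct cache, cache_write(), cache_writeback(), the 16-byte "pages") is a hypothetical toy, not the kernel's page cache.

```c
#include <stdbool.h>
#include <string.h>

#define NPAGES 4
#define PAGE_BYTES 16   /* tiny pages keep the example short */

struct cache {
    char mem[NPAGES][PAGE_BYTES];    /* cached copies */
    char disk[NPAGES][PAGE_BYTES];   /* backing store */
    bool dirty[NPAGES];
};

/* Write-back: update only the cached page and mark it dirty. */
void cache_write(struct cache *c, int page, const char *data)
{
    strncpy(c->mem[page], data, PAGE_BYTES - 1);
    c->mem[page][PAGE_BYTES - 1] = '\0';
    c->dirty[page] = true;
}

/* Writeback pass: flush every dirty page, return how many were written. */
int cache_writeback(struct cache *c)
{
    int flushed = 0;
    for (int i = 0; i < NPAGES; i++) {
        if (!c->dirty[i])
            continue;
        memcpy(c->disk[i], c->mem[i], PAGE_BYTES);
        c->dirty[i] = false;
        flushed++;
    }
    return flushed;
}
```

Between cache_write() and cache_writeback() the disk copy is stale, which is the price write-back pays for fast writes. A write-through variant would do the memcpy inside cache_write() itself.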

28). Cache Eviction :
        The final piece to caching is the process by which data is removed from the cache, either
        to make room for more relevant cache entries or to shrink the cache to make available
        more RAM for other uses.
        strategy for cache eviction :
        --> LRU
        --> Two list strategy.
29). The Linux page cache uses a dedicated object to manage entries in the cache and page I/O operations.
    That object is the address_space structure.
    struct address_space {...}.
    --> When a user process invokes the sync() and fsync() system calls, the kernel performs
        writeback on demand.
    --> Flusher threads carry out the writeback, flushing dirty pages to disk.


©2018 by memoryfaults.com.