summaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/cpu/intel_rdt_monitor.c
AgeCommit message (Collapse)AuthorFilesLines
2018-05-19x86/intel_rdt/mba_sc: Feedback loop to dynamically update mem bandwidthVikas Shivappa1-1/+125
mba_sc is a feedback loop where we periodically read MBM counters and try to restrict the bandwidth below a max value so the below is always true: "current bandwidth(cur_bw) < user specified bandwidth(user_bw)" The frequency of these checks is currently 1s and we just tag along the MBM overflow timer to do the updates. Doing it once in a second also makes the calculation of bandwidth easy. The steps of increase or decrease of bandwidth is the minimum granularity specified by the hardware. Although the MBA's goal is to restrict the bandwidth below a maximum, there may be a need to even increase the bandwidth. Since MBA controls the L2 external bandwidth where as MBM measures the L3 external bandwidth, we may end up restricting some rdtgroups unnecessarily. This may happen in the sequence where rdtgroup (set of jobs) had high "L3 <-> memory traffic" in initial phases -> mba_sc kicks in and reduced bandwidth percentage values -> but after some it has mostly "L2 <-> L3" traffic. In this scenario mba_sc increases the bandwidth percentage when there is lesser memory traffic. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/1524263781-14267-7-git-send-email-vikas.shivappa@linux.intel.com
2018-05-19x86/intel_rdt/mba_sc: Prepare for feedback loopVikas Shivappa1-6/+38
This is a preparatory patch for the mba feedback loop. Add support to measure the "bandwidth in MBps" and the "delta bandwidth". Measure it by reading the MBM IA32_QM_CTR MSRs and calculating the amount of "bytes" moved. There is no user space interface for this and will only be used by the feedback loop patch. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: hpa@zytor.com Link: https://lkml.kernel.org/r/1524263781-14267-6-git-send-email-vikas.shivappa@linux.intel.com
2017-10-05x86/intel_rdt/cqm: Make integer rmid_limbo_count staticColin Ian King1-1/+1
rmid_limbo_count is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: symbol 'rmid_limbo_count' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: kernel-janitors@vger.kernel.org Link: https://lkml.kernel.org/r/20171002145931.27479-1-colin.king@canonical.com
2017-08-16x86/intel_rdt/cqm: Improve limbo list processingVikas Shivappa1-117/+93
During a mkdir, the entire limbo list is synchronously checked on each package for free RMIDs by sending IPIs. With a large number of RMIDs (SKL has 192) this creates a intolerable amount of work in IPIs. Replace the IPI based checking of the limbo list with asynchronous worker threads on each package which periodically scan the limbo list and move the RMIDs that have: llc_occupancy < threshold_occupancy on all packages to the free list. mkdir now returns -ENOSPC if the free list and the limbo list ere empty or returns -EBUSY if there are RMIDs on the limbo list and the free list is empty. Getting rid of the IPIs also simplifies the data structures and the serialization required for handling the lists. [ tglx: Rewrote changelog ... ] Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Link: http://lkml.kernel.org/r/1502845243-20454-3-git-send-email-vikas.shivappa@linux.intel.com
2017-08-16x86/intel_rdt/mbm: Fix MBM overflow handler during CPU hotplugVikas Shivappa1-2/+2
When a CPU is dying, the overflow worker is canceled and rescheduled on a different CPU in the same domain. But if the timer is already about to expire this essentially doubles the interval which might result in a non detected overflow. Cancel the overflow worker and reschedule it immediately on a different CPU in same domain. The work could be flushed as well, but that would reschedule it on the same CPU. [ tglx: Rewrote changelog once again ] Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Link: http://lkml.kernel.org/r/1502845243-20454-2-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01x86/intel_rdt/mbm: Handle counter overflowVikas Shivappa1-0/+63
Set up a delayed work queue for each domain that will read all the MBM counters of active RMIDs once per second to make sure that they don't wrap around between reads from users. [Tony: Added the initializations for the work structure and completed the patch] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-29-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01x86/intel_rdt/mbm: Add mbm counter initializationVikas Shivappa1-0/+7
MBM counters are monotonically increasing counts representing the total memory bytes at a particular time. In order to calculate total_bytes for an rdtgroup, we store the value of the counter when we create an rdtgroup or when a new domain comes online. When the total_bytes(all memory controller bytes) or local_bytes(local memory controller bytes) file in "mon_data" is read it shows the total bytes for that rdtgroup since its creation. User can snapshot this at different time intervals to obtain bytes/second. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-28-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01x86/intel_rdt/mbm: Basic counting of MBM events (total and local)Tony Luck1-1/+30
Check CPUID bits for whether each of the MBM events is supported. Allocate space for each RMID for each counter in each domain to save previous MSR counter value and running total of data. Create files in each of the monitor directories. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-27-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01x86/intel_rdt/cqm: Add mount,umount supportVikas Shivappa1-0/+1
Add monitoring support during mount and unmount. Since root directory is a "ctrl_mon" directory which can control and monitor resources create the "mon_groups" directory which can hold monitor groups and a "mon_data" directory which would hold all monitoring data like the rest of resource groups. The mount succeeds if either of monitoring or control/allocation is enabled. If only monitoring is enabled user can still create monitor groups under the "/sys/fs/resctrl/mon_groups/" and any mkdir under root would fail. If only control/allocation is enabled all of the monitoring related directories/files would not exist and resctrl would work in legacy mode. Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-23-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01x86/intel_rdt/cqm: Add mon_dataVikas Shivappa1-0/+49
Add a mon_data directory for the root rdtgroup and all other rdtgroups. The directory holds all of the monitored data for all domains and events of all resources being monitored. The mon_data itself has a list of directories in the format mon_<domain_name>_<domain_id>. Each of these subdirectories contain one file per event in the mode "0444". Reading the file displays a snapshot of the monitored data for the event the file represents. For ex, on a 2 socket Broadwell with llc_occupancy being monitored the mon_data contents look as below: $ ls /sys/fs/resctrl/p1/mon_data/ mon_L3_00 mon_L3_01 Each domain directory has one file per event: $ ls /sys/fs/resctrl/p1/mon_data/mon_L3_00/ llc_occupancy To read current llc_occupancy of ctrl_mon group p1 $ cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy 33789096 [This patch idea is based on Tony's sample patches to organise data in a per domain directory and have one file per event (and use the fp->priv to store mon data bits)] Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-20-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01x86/intel_rdt/cqm: Add RMID (Resource monitoring ID) managementVikas Shivappa1-0/+213
Hardware uses RMID(Resource monitoring ID) to keep track of each of the RDT events associated with tasks. The number of RMIDs is dependent on the SKU and is enumerated via CPUID. We add support to manage the RMIDs which include managing the RMID allocation and reading LLC occupancy for an RMID. RMID allocation is managed by keeping a free list which is initialized to all available RMIDs except for RMID 0 which is always reserved for root group. RMIDs goto a limbo list once they are freed since the RMIDs are still tagged to cache lines of the tasks which were using them - thereby still having some occupancy. They continue to be in limbo list until the occupancy < threshold_occupancy. The threshold_occupancy is a user configurable value. OS uses IA32_QM_CTR MSR to read the occupancy associated with an RMID after programming the IA32_EVENTSEL MSR with the RMID. [Tony: Improved limbo search] Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-10-git-send-email-vikas.shivappa@linux.intel.com
2017-08-01x86/intel_rdt/cqm: Add RDT monitoring initializationVikas Shivappa1-0/+161
Add common data structures for RDT resource monitoring and perform RDT monitoring related data structure initializations which include setting up the RMID(Resource monitoring ID) lists and event list which the resource supports. [ tony: some cleanup to make adding MBM easier later, remove "cqm" from some names, make some data structure local to intel_rdt_monitor.c static. Add copyright header] [ tglx: Made it readable ] Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-9-git-send-email-vikas.shivappa@linux.intel.com