Since the database itself is usually about 100 times
larger than amount of main memory available on a machine, only a subset
of the database pages can fit in memory at any given time. (For an
antithetical view on this subject, read this.)
The buffer manager is used to control which pages are memory-resident.
Whenever a request is made (by higher layers of the database system,
like your Heapfile layer) for
a data page, the buffer manager must check to see whether the requested
page is already in the buffer pool. If so, the buffer manager simply returns
a pointer to the page. If not, the buffer manager frees a frame if
necessary (possibly
by writing to disk the page it contains, in case it is dirty) and then
reads in the requested page from disk into the frame that has been freed.
struct FileAndPage
{
File* file; // pointer to a file object
int pageNo; // page number within a file
}
The FileAndPage struct (along with a couple of member functions) is in the file buf.h
The following is the definition of the BufHashTbl class (given in buf.h).
class BufHashTbl
{
private:
typedef hash_map<FileAndPage, unsigned int> BufHashTable;
BufHashTable hashTable; // Hash table for locating
frame
public:
// insert entry into hash table mapping (file,pageNo)
to frameNo
const Status insert(File* file, const int pageNo,
const unsigned int frameNo);
// Check if (file,pageNo) is currently in the
buffer pool (ie. in
// the hash table) and return frameNo if found.
const Status lookup(File* file, const int pageNo,
unsigned int & frameNo) const;
// delete entry (file,pageNo) from hash table.
const Status remove(File* file, const int pageNo);
};
Notice that the BufHashTbl class definition that
we have supplied uses the hash_map class from the Standard Template
Library (STL) to implement the hash table that maps an instance of the
FileAndPage struct to an unsigned int (the frameNo).
const Status insert(File* file, const int pageNo, const unsigned int frameNo)
Inserts a new entry into the hash table mapping file and pageNo to frameNo. Returns HASHTBLERROR if the entry already exists and OK otherwise.
const Status lookup(File* file, const int pageNo, unsigned int & frameNo) const
Finds the frameNo that is obtained by hashing
file and pageNo. Returns HASHNOTFOUND if entry is not found and OK otherwise.
const Status remove(File* file, const int pageNo)
Removes the entry obtained by hashing file
and pageNo. Returns HASHTBLERROR if entry is not found, OK otherwise.
class BufDesc {
friend class BufMgr;
private:
File* file; // pointer to file object
int pageNo; // page within file
int pinCnt; // number of times this
page has been pinned
bool dirty; // true if dirty;
false otherwise
bool valid; // true if page is valid
bool refbit; // has this buffer frame
been referenced recently?
void Clear() { // initialize buffer frame for a new
user
pinCnt = 0;
file = NULL;
pageNo = -1;
dirty = refbit = false;
valid = false;
};
void Set(File* filePtr, int pageNum) {
file = filePtr;
pageNo = pageNum;
pinCnt = 1;
dirty = false;
refbit = true;
valid = true;
}
BufDesc() {
Clear();
}
};
First notice that all attributes of the BufDesc class are private and that the BufMgr class is defined to be a friend. While this may seem strange, this approach restricts access to only the BufMgr class. The alternative (making everything public) opens up access too far.
The purpose of most of the attributes of the BufDesc class should be
pretty obvious. The dirty bit, if true, indicates that the page
is dirty (i.e. has been updated) and thus must be written to disk before
the frame is used to hold another page. The pinCnt indicates how
many times the page has been pinned. The refbit is used by the
clock algorithm. The valid bit is used to indicate whether the
frame contains a valid page.
class BufMgr
{
private:
unsigned int clockHand; // clock hand
for clock algorithm
BufHashTbl hashTable; // hash
table mapping (File, page)
// to frame number
BufDesc *bufTable;
// vector of status info, 1 per page
unsigned int numBufs; // Number
of pages in buffer pool
BufStats bufStats;
// Statistics about buffer pool usage
const Status allocBuf(unsigned int & frame);
// allocate a free
// frame using the clock algorithm
public:
Page
*bufPool; // actual buffer pool
BufMgr(const unsigned int bufs);
~BufMgr();
const Status readPage(File* file, const int PageNo, Page*& page);
const Status unPinPage(File* file, const int PageNo,
const bool dirty);
const Status allocPage(File* file, int& PageNo, Page*&
page);
const Status flushFile(File* file);
const Status disposePage(File* file,
const int PageNo);
const BufStats & getBufStats() const // Get buffer pool
usage
{
return bufStats;
}
void clearBufStats()
{
bufStats.clear();
}
};
BufMgr(const unsigned int bufs)
Allocates a buffer pool with bufs page frames and
a corresponding BufDesc table. The way things are set up all frames will
be in the clear state when the buffer pool is allocated. The hash table
will also start out in an empty state.
~BufMgr()
Flushes out all dirty pages and deallocates the buffer pool and the BufDesc table.
const Status allocBuf(unsigned int & frame)
Allocates a free frame using the clock algorithm; if necessary, writes a dirty page back to disk. Returns BUFFEREXCEEDED if all buffer frames are pinned, UNIXERR if an OS file system error occurred when a dirty page was being written to disk and OK otherwise.
const Status readPage(File* file, const int PageNo, Page*& page)
If the page is already in the buffer pool,
sets the refbit, increments the pinCnt and then returns a pointer to the
buffer frame containing the page (via the page parameter). If the page
is not in the buffer pool, allocBuf() is called to allocate a buffer
frame. Then the method file->readPage() is called to read the page into the
buffer pool frame from the disk. Finally, Set() is invoked on
the frame to set it up properly.
Returns OK if no errors occurred, UNIXERR if an OS
error occurred, BUFFEREXCEEDED if all buffer frames are pinned, HASHTBLERROR
if a hash table error occurred.
const Status unPinPage(File* file, const int PageNo, const bool dirty)
Decrements the pinCnt of the frame containing (file, PageNo) and if dirty == true, sets the dirty bit.Returns OK if no errors occurred, HASHNOTFOUND if the page is not in the buffer pool hash table, PAGENOTPINNED if the pin count is already 0.
const Status allocPage(File* file, int& PageNo, Page*& page)
Allocates a new page in the specified file. Returns the page number of the page via the pageNo parameter and a pointer to the buffer frame allocated for the page via the page parameter. Returns OK if no errors occurred, UNIXERR if an OS error occurred, BUFFEREXCEEDED if all buffer frames are pinned and HASHTBLERROR if a hash table error occurred.
const Status disposePage(File* file, const int PageNo)
Deallocates the page denoted by PageNo from the file. The page may or may not currently exist in the buffer pool. If it exists, then the frame containing the page must be cleared. Returns OK if no errors occurred, UNIXERR if an OS error occurred, PAGEPINNED if the page is currently pinned in the buffer pool.
const Status flushFile(File* file)
This method will be called by DB::closeFile()
when all instances of a file have been closed (in which case all pages
of the file should have been unpinned). flushFile() should scan bufTable
and for each dirty page it should call file->writePage() to flush
the dirty page to disk. Note that if the page is not dirty, nothing needs
to be done. In other words some pages of the file might remain in the buffer
pool even after the file has been closed. These pages might become useful
if the file is opened again. Returns OK if no errors occurred and PAGEPINNED
if some page of the file is pinned.
struct BufStats
{
unsigned accesses; // Total number of
accesses to buffer pool
unsigned diskreads; // Number of pages read
from disk
unsigned diskwrites; // Number of pages written back
to disk
void clear()
{
accesses = diskreads = diskwrites
= 0;
}
BufStats()
{
clear();
}
};
The statistics can be retrieved using the getBufStats() and cleared using
clearBufStats().